Managed Table
This is the Default Table Type
Data will be copied into the /user/hive/warehouse
folder (UHW) inside the database folder
Generated Columns values will be calculated automatically if we do not provide a value
External Table
Unmanaged Tables are create when the location/ path to the data is specified by the user
When new data is added into external table we need to perform “REFRESH” operation to see the latest data
Exception: JDBC Tables show up as managed
Data objects in the Databricks Lakehouse | Databricks on AWS
CTAS (Create Table As Select)
Does not support specifying OPTIONS
The schema of the data is inferred from the output of the query
For formats that are not self describing like CSV we can use a query that allows defining schema (Temp View/ Table) as required before loading
Generated Column
Generated columns are columns who’s value is calculated using the value of another column
When a new value is inserted into the table the value of the generated column is calculated automatically
If we manually try to insert value for the generated column it should be an exact match else the query will fail
They are a special type of CHECK constraint