If you are lazy person like us, you will likely want to use Autodetect option for detecting the data types:
This is a very helpful option when your table has e. g. 100 columns and you don’t have time to say which column is which data type (DATE, DATETIME, STRING, INTEGER…). You basically let BigQuery to do it’s magic to assign proper data types – so what looks likes a string will be STRING, what looks like an integer will be INTEGER and so on. Sound great, right?
Well…there are a couple potential issues:
- Your data needs to be very coherent because BigQuery is guessing the data types based on the first X rows. So if it sees 50 integers in the column, it will set the column to INTEGER. However, if there is suddenly a STRING value in the column, the upload will fail because BigQuery already expects INTEGER.
- Your date columns need to be formatted in a supported format. Read more here.
- Your datetime columns need to be formatted in a supported format. Read more here.
- The columns are uploaded into BigQuery in a random order. So if your table has these columns: A | B | C | D | E in Google Sheet, it will end up being uploaded as D | C | B | E | A for example. We are not sure why BigQuery does this but it’s not in our control. The random order is just a visual issue when previewing the data in BigQuery. It has no impact functionality in BigQuery.
So what’s the conlusion?
If you have nice coherent data with properly set date and datetime formats, feel free to use Autodetect, otherwise consider using Manual or All Columns as STRING option.