Add a date_loaded field when uploading a CSV to BigQuery
Using Python, is there any way to add an extra field while loading a CSV file into BigQuery?
I'd like to add a date_loaded field containing the current date.
This is the Google code example I have used:
from google.cloud import bigquery

client = bigquery.Client()
dataset_id = 'my_dataset'

dataset_ref = client.dataset(dataset_id)
job_config = bigquery.LoadJobConfig()
# Only the columns present in the CSV are declared; there is no date_loaded yet.
job_config.schema = [
    bigquery.SchemaField('name', 'STRING'),
    bigquery.SchemaField('post_abbr', 'STRING'),
]
job_config.skip_leading_rows = 1
# The source format defaults to CSV, so the line below is optional.
job_config.source_format = bigquery.SourceFormat.CSV
uri = 'gs://cloud-samples-data/bigquery/us-states/us-states.csv'

load_job = client.load_table_from_uri(
    uri,
    dataset_ref.table('us_states'),
    job_config=job_config)  # API request
print('Starting job {}'.format(load_job.job_id))

load_job.result()  # Waits for table load to complete.
print('Job finished.')

destination_table = client.get_table(dataset_ref.table('us_states'))
print('Loaded {} rows.'.format(destination_table.num_rows))
python google-bigquery
asked Nov 21 '18 at 12:49
mez63
Do date-partitioned tables work for you? If not, a better approach might be to use Apache Beam instead. If that still doesn't work, the only way out I see is to bring the data local, iterate over it, and add the date field. This is not recommended if you are working with lots of data, though.
– Willian Fuks
Nov 21 '18 at 15:16
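The last option in the comment above (bring the data local, iterate over it, and add the date field) could look roughly like the sketch below. It uses only the standard library. The function name `add_date_loaded` and the sample row are hypothetical, and the sketch assumes the CSV has exactly one header row.

```python
import csv
import datetime
import io


def add_date_loaded(src, dst, load_date=None):
    """Copy CSV rows from src to dst, appending a date_loaded column.

    src and dst are text file objects. load_date defaults to today's
    local date; it is a parameter so the result can be reproduced.
    """
    load_date = load_date or datetime.date.today()
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader, None)
    if header is None:
        return  # Empty input: nothing to write.
    writer.writerow(header + ["date_loaded"])

    # ISO format (YYYY-MM-DD) is what a BigQuery DATE column expects in CSV.
    stamp = load_date.isoformat()
    for row in reader:
        writer.writerow(row + [stamp])


# Hypothetical sample data shaped like the us-states file in the question.
source = io.StringIO("name,post_abbr\nAlabama,AL\n")
target = io.StringIO()
add_date_loaded(source, target, datetime.date(2018, 11, 21))
print(target.getvalue())
```

If the rewritten file is then loaded with the code from the question, the schema would also need a third entry such as `bigquery.SchemaField('date_loaded', 'DATE')`.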
1
..or load it into a staging/tmp table in BigQuery then hit it with SQL, and add thedate_loaded
field as part of that SQL transform. Write the results to your main table. If you use ingestion based partition table, just be aware that its's in UTC unless you address the partition directly (cloud.google.com/bigquery/docs/…)
– Graham Polley
Nov 22 '18 at 12:15
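A minimal sketch of the staging-table idea from the comment above, assuming the CSV has already been loaded into a staging table with the load job from the question. The function names `build_transform_sql` and `append_with_date_loaded` are hypothetical, both table IDs are assumed to be fully qualified (`project.dataset.table`), and `CURRENT_DATE()` is evaluated in UTC unless a time zone is passed to it.

```python
def build_transform_sql(staging_table):
    """Return standard SQL that selects every staging row plus date_loaded."""
    return (
        "SELECT *, CURRENT_DATE() AS date_loaded "
        "FROM `{}`".format(staging_table)
    )


def append_with_date_loaded(client, staging_table, destination_table):
    """Run the transform and append the result to the main table.

    client is a google.cloud.bigquery.Client. Both table arguments are
    assumed to be 'project.dataset.table' strings.
    """
    # Imported here so the SQL helper above works without the library installed.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig()
    job_config.destination = bigquery.TableReference.from_string(destination_table)
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND

    query_job = client.query(
        build_transform_sql(staging_table), job_config=job_config)
    query_job.result()  # Waits for the query to finish.
    return query_job


# Hypothetical table ID, shown only to illustrate the generated SQL.
print(build_transform_sql("my-project.my_dataset.us_states_staging"))
```

Appending with `WRITE_APPEND` keeps earlier loads in the main table, so each batch carries the date on which it was transformed.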