I have a field in Cosmos DB which is mapped as a number, but it should be a string. I'd like to alter the schema in place without reloading the data. Is this possible with a query, in the same way it can be achieved in SQL?
ALTER TABLE EVENTS
MODIFY COLUMN eventAmount varchar;
I have consulted the docs, but they only reference simple SQL commands.
DocumentDB is schemaless. There is no structure defined outside the documents themselves, so each document has its own schema. If you want certain documents to follow a certain structure, that must be enforced by yourself in your application logic.
So this means you cannot "alter the schema" of a collection to change data types.
What you can and should do is fix the documents you consider to have the wrong schema by updating them. Query the documents where eventAmount is stored as a number and save each document back with the value stored as the corresponding string instead.
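For illustration, a minimal sketch of that fix-up using the azure-cosmos Python SDK might look like this (the account URL, key, and database/container names are placeholders, so adjust them to your setup):

from azure.cosmos import CosmosClient

# Placeholder connection details -- replace with your own account, key and names.
client = CosmosClient("https://<your-account>.documents.azure.com:443/", credential="<your-key>")
container = client.get_database_client("<your-database>").get_container_client("<your-container>")

# Find documents where eventAmount was stored as a number ...
query = "SELECT * FROM c WHERE IS_NUMBER(c.eventAmount)"
for doc in container.query_items(query=query, enable_cross_partition_query=True):
    # ... and write each one back with the value converted to a string.
    doc["eventAmount"] = str(doc["eventAmount"])
    container.replace_item(item=doc["id"], body=doc)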
I am trying to copy tables from one schema to another within the same Azure SQL DB. So far, I have created a lookup pipeline and passed the parameters for the ForEach loop and Copy activity. But my sink dataset is not taking the parameter value I have given under the "table option" field; rather, it is taking the dummy table I chose when creating the sink dataset. Can someone tell me how I can pass a dynamic table name to a sink dataset?
I have given concat('dest_schema.STG_',#{item().table_name})} in the table option field.
To make the schema and table names dynamic, add Parameters to the Dataset:
Most important - do NOT import a schema. If you already have one defined in the Dataset, clear it. For this Dataset to be dynamic, you don't want improper schemas interfering with the process.
In the Copy activity, provide the values at runtime. These can be hardcoded values, variables, parameters, or expressions, so this is very flexible.
If it's the same database, you can even use the same Dataset for both, just provide different values for the Source and Sink.
WARNING: If you use the "Auto-create table" option, the schema for the new table will define any character field as varchar(8000), which can cause serious performance problems.
MY OPINION:
While you can do this, one of my personal rules is to not cross the database boundary. If the Source and Sink are in the same SQL database, I would try to solve this problem with a Stored Procedure rather than Data Factory.
I need to get the DDL of a particular database object, such as a schema, table, or column. Is there a way to extract it from the system catalog tables using SQL?
I tried to find a table in information_schema or pg_catalog with the required information, but I didn't find one.
There is a way to extract it from the system catalogs, but the method depends on what type of object it is, and is not easy.
pg_dump knows how to do it. I would just use that rather than reinventing things. You can get just the DDL (excluding the data itself) using the "-s" option. Then you can fish out the DDL for your specific desired object using your favorite text editor. If the object is a table, you can tell pg_dump to dump just that table, but for other objects you can't.
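If you want to script that step, here is a minimal sketch that shells out to pg_dump from Python (it assumes pg_dump is on the PATH and that connection settings come from the usual PG* environment variables; the database and table names are placeholders):

import subprocess

# "-s" dumps only the schema (DDL), no data; "-t" restricts the dump to a single table.
result = subprocess.run(
    ["pg_dump", "-s", "-t", "public.my_table", "my_database"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # the CREATE TABLE / index / constraint statements for that table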
I am using Dataflow's WriteToBigQuery with CREATE_IF_NEEDED, and thus have to specify the schema.
I define the schema at the beginning of my code (outside the actual pipeline), but since I need the --save_main_session flag, I get the same error as here, which explains that the schema cannot be passed along with the pipeline since a BigQuery schema definition is not pickleable.
The solution mentioned on that page (disabling the --save_main_session flag) is not an option for me, so the remaining option is to specify the schema through a string.
However, I need to set some fields to REQUIRED. Is there a way to do this with the string schema definition?
As you can see from bigquery.py, the conversion from a string schema to a TableSchema is quite straightforward and does indeed set the mode to NULLABLE. Perhaps you can create the TableSchema with REQUIRED fields yourself, based on that code.
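For example, a minimal sketch of building the schema object directly so the mode can be set per field (the field names below are placeholders, and this follows the general Beam TableSchema / TableFieldSchema pattern rather than anything specific to your pipeline):

from apache_beam.io.gcp.internal.clients import bigquery

# Build the schema explicitly instead of passing a string.
table_schema = bigquery.TableSchema()

id_field = bigquery.TableFieldSchema()
id_field.name = 'event_id'      # placeholder field name
id_field.type = 'STRING'
id_field.mode = 'REQUIRED'      # a string schema would default this to NULLABLE
table_schema.fields.append(id_field)

amount_field = bigquery.TableFieldSchema()
amount_field.name = 'event_amount'  # placeholder field name
amount_field.type = 'NUMERIC'
amount_field.mode = 'NULLABLE'
table_schema.fields.append(amount_field)

# Pass table_schema as the schema argument of WriteToBigQuery.

Constructing the schema inside the function that builds the pipeline, rather than at module level, may also sidestep the --save_main_session pickling issue, though I haven't verified that in your setup.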
Every morning, an automatic job creates a new table from an Avro file. In the afternoon, I need to append some data to this table from a query.
When trying to do so, I get the following error:
Error: Invalid schema update. Field chn has changed mode from REQUIRED to NULLABLE
I noticed that I can change the mode of the field chn from REQUIRED to NULLABLE in the BigQuery web UI and then it works fine, but I would have to do it manually every day, which is not what I am looking for.
Is there a way to "cast" the field as REQUIRED during the append query?
Or, during the first import from the Avro file, force the field to be NULLABLE instead of REQUIRED?
Thanks!
The feature that allows relaxing a field as part of a query or a load job will be available in production shortly. I will update this answer when it goes live (likely within a week).
Update: 08/25/2016
You can supply schemaUpdateOptions in load or query job configuration.
Multiple options can be provided.
It allows the schema of the destination table to be updated as a side effect of the load or query job. Schema update options are supported in two cases:
When writeDisposition is WRITE_APPEND
When writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators
For non-partitioned tables, WRITE_TRUNCATE will always overwrite the schema.
The following values are supported:
ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema
ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable
NOTE: This doesn't currently work with schema auto-detection. We plan to support that soon.
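For reference, a minimal sketch of such an append query with field relaxation, using the current google-cloud-bigquery Python client (project, dataset and table names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.QueryJobConfig(
    destination="my-project.my_dataset.my_table",            # placeholder destination table
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    schema_update_options=[
        bigquery.SchemaUpdateOption.ALLOW_FIELD_RELAXATION,  # relax REQUIRED fields to NULLABLE
    ],
)

client.query(
    "SELECT * FROM `my-project.my_dataset.afternoon_source`",  # placeholder source query
    job_config=job_config,
).result()

The same schema_update_options property is available on LoadJobConfig for load jobs.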
I have a SQL Server view CyclesList on the table Cycle. The Cycle table contains a few columns, and the CyclesList view adds some more data that can be computed at the database level.
And now, I have a NHibernate mapping that points to CyclesList:
<class name="Cycle" table="CyclesList">
However, I would still like to work with the Cycle class and perform Create/Update operations, but I have to use a stored procedure that will access the Cycle table directly. Is there a way to achieve this in NHibernate? I would appreciate a sample mapping/links to resources with samples. Thanks
You can find some information in the docs under "Native SQL -> Custom SQL for create, update and delete". Basically, you need the "sql-insert", "sql-delete" and "sql-update" elements in the mapping file.
There is also an example on Ayende's blog.