I am importing raw data in Groovy, hundreds of thousands of entries. I use the table fields as the keys of a hash map and then use the add(hash) method on a groovy.sql DataSet. The DataSet goes to a Postgres table whose ID field is auto-generated from a sequence. I need to get the ID of each record as it is inserted.
In Java + Hibernate the ID gets set automatically on the corresponding field of the object being persisted. In this case the add() method does not return anything, nor does it add an id field to the hash map. I am trying to avoid using Hibernate/GORM here for efficiency reasons.
Thanks for any pointers or for a better approach.
groovy.sql.Sql has an executeInsert() method which returns a list of the auto-generated column values for each inserted row.
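For instance, a minimal sketch, assuming a raw_data table with a sequence-backed id column (table, column, and connection details are all placeholders):

import groovy.sql.Sql

// Connection details are placeholders.
def sql = Sql.newInstance('jdbc:postgresql://localhost:5432/mydb',
        'user', 'password', 'org.postgresql.Driver')

def row = [name: 'Alice', height: 170]

// executeInsert returns a List<List<Object>>: one inner list of
// generated key values per inserted row. Note that the Postgres
// driver may report the whole inserted row as "generated keys";
// the id column is assumed to come first here.
def keys = sql.executeInsert(
        'INSERT INTO raw_data (name, height) VALUES (?, ?)',
        [row.name, row.height])

def id = keys[0][0]
println "inserted row with id ${id}"

Looping this over your hash maps instead of calling DataSet.add() keeps the import at plain-JDBC speed while still handing back each generated ID.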
I want to create a boolean table with the same structure as another table. I know how to create the table; my issue is keeping it updated.
Let's say I have a table A1 with 10 columns holding different attributes for a person, such as height, run speed, name, hair colour, etc.
I then want to be able to modify this table by adding or removing columns on A1, and have those updates apply to my other table B1 so that it has the same columns but holds a boolean value (the boolean value is not based on A1).
My first question is whether this is doable.
My second is: will the updates be super inefficient for, let's say, 200-300 records?
(I could probably create an external program that reads the table and manually removes and adds columns via ADD/DROP SQL statements, but I was hoping there was a more dynamic/efficient solution.)
What you want, as another answer noted, is an EAV ("entity-attribute-value") schema. This allows you to dynamically add new attributes without changing any physical table schema. It is also horrible for performance (though with only a few hundred entities it shouldn't be too bad).
Another equally ugly solution is to add as many columns as you think you'll ever need, named Attribute_1, Attribute_2, etc. Then you have a lookup table which allows you to map attributes to their definitions.
This is less flexible than the EAV schema, but allows you to index on specific attributes so that your queries are a little more performant.
Another solution would be to use XML data types to hold the attributes and values. SQL Server has built-in functionality for XML data; while it's not as easy to use as normal SQL, it does work quite well.
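A rough sketch of that XML approach (SQL Server syntax; all table, column, and element names here are invented), run through groovy.sql.Sql just to keep the example self-contained:

import groovy.sql.Sql

def sql = Sql.newInstance('jdbc:sqlserver://localhost;databaseName=mydb',
        'user', 'password')   // placeholder connection details

// One XML column holds an open-ended bag of attributes per person.
sql.execute '''
    CREATE TABLE PersonAttributes (
        PersonId   INT PRIMARY KEY,
        Attributes XML
    )
'''

sql.execute '''
    INSERT INTO PersonAttributes VALUES
        (1, '<attrs><height>180</height><hairColour>brown</hairColour></attrs>')
'''

// SQL Server's XML .value() method pulls typed values back out.
sql.rows('''
    SELECT PersonId,
           Attributes.value('(/attrs/height)[1]', 'INT') AS Height
    FROM PersonAttributes
''').each { println it }

Adding a new attribute is then just a new element in the XML, with no schema change at all.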
Rather than adding and removing columns on the table, I would suggest that you have a table with the fixed attributes. Then have another table which stores the additional attributes (the names of the columns), and a third table which holds the id of the person, the id of the attribute, and the value of the attribute.
For example, the User table:
UserId
Firstname
Surname
The Attribute table:
AttrId
AttrName
The UserAttribute table:
UserId
AttrId
AttrValue
To answer your question, you could have two sets of these tables, where the AttrValue in the second set is boolean.
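A minimal sketch of that schema (PostgreSQL-flavoured DDL; every name here is a placeholder), created through groovy.sql.Sql:

import groovy.sql.Sql

def sql = Sql.newInstance('jdbc:postgresql://localhost:5432/mydb',
        'user', 'password', 'org.postgresql.Driver')

// Fixed attributes stay as ordinary columns.
sql.execute '''
    CREATE TABLE Users (
        UserId    SERIAL PRIMARY KEY,
        Firstname VARCHAR(100),
        Surname   VARCHAR(100)
    )
'''

// Dynamic attributes become rows here rather than columns on Users.
sql.execute '''
    CREATE TABLE Attribute (
        AttrId   SERIAL PRIMARY KEY,
        AttrName VARCHAR(100) NOT NULL
    )
'''

// One row per (user, attribute) pair; swap VARCHAR for BOOLEAN in
// the second, boolean-valued set of tables the question asks about.
sql.execute '''
    CREATE TABLE UserAttribute (
        UserId    INT REFERENCES Users(UserId),
        AttrId    INT REFERENCES Attribute(AttrId),
        AttrValue VARCHAR(255),
        PRIMARY KEY (UserId, AttrId)
    )
'''

Adding or removing an attribute is then a one-row INSERT or DELETE on Attribute rather than an ALTER TABLE, which is what makes the updates cheap.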
An intermediate option is to go for multiple spare columns in the table and use the attribute table to store a column name plus a boolean indicating whether the column is in use.
I have a huge BQ table with a complex schema (lots of repeated and record fields). Is there a way for me to add more columns to this table and/or create a select that would copy the entire table into a new one with the addition of one (or more) columns? It appears as if copying a table requires flattening of repeated columns (not good). I need an exact copy of the original table with some new columns.
I found a way to Update Table Schema but it looks rather limited as I can only seem to add nullable or repeated columns. I can't add record columns or remove anything.
If I were to modify my import JSON data (and schema) I could import anything. But my import data is huge and conveniently already in a denormalized gzipped JSON so changing that seems like a huge effort.
I think you can add fields of type RECORD.
Nullable and repeated refer to a field's mode, not its type. So you can add a Nullable record or a Repeated record, but you cannot add a Required record.
https://cloud.google.com/bigquery/docs/reference/v2/tables#resource
You are correct that you cannot delete anything.
If you want to use a query to copy the table, but don't want nested and repeated fields to be flattened, you can set the flattenResults parameter to false to preserve the structure of your output schema.
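As an illustration, this is roughly the job configuration the v2 jobs.insert call expects, built here as a Groovy map and serialized to JSON (project, dataset, and table names are placeholders):

import groovy.json.JsonOutput

// With legacy SQL, flattenResults=false also requires a destination
// table and, for large outputs, allowLargeResults=true.
def jobBody = [
    configuration: [
        query: [
            query            : 'SELECT * FROM [mydataset.mytable]',
            destinationTable : [projectId: 'my-project',
                                datasetId: 'mydataset',
                                tableId  : 'mytable_copy'],
            allowLargeResults: true,
            flattenResults   : false   // keep nested/repeated structure intact
        ]
    ]
]

println JsonOutput.prettyPrint(JsonOutput.toJson(jobBody))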
I am interested in using the SqlDataAdapter with the DataTable and the associated Insert/Update/Delete command operations that I can attach to the adapter object. My question is this: does each row in the DataTable necessarily need to correspond to one physical table? What I would like to be able to do is allow a single row to represent columns that span multiple tables, and then craft each of the insert/update commands to handle its operation across those tables. That would mean that what I assign to the command might actually be a more complex SQL statement, even wrapped in BEGIN/END, so that I can insert into the first "anchor" table and then use that primary key for the foreign key column of the subsequent tables.
So far all the examples I see have each DataTable representing a single physical table. I realize that I could perhaps use a DataSet, but then how would I attach a command to each DataTable within the set? Furthermore, how could I then relate the rows of one table to the rows of the child table?
Has anyone tried this?
You could create a view with an INSTEAD OF INSERT trigger. Within the trigger you can split the columns as you like and do multiple inserts into different tables.
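A rough T-SQL sketch of that idea (all object names invented; single-row inserts assumed for brevity), with the DDL executed here through groovy.sql.Sql:

import groovy.sql.Sql

def sql = Sql.newInstance('jdbc:sqlserver://localhost;databaseName=mydb',
        'user', 'password')   // placeholder connection details

// A view presenting columns from both the anchor and the detail table.
sql.execute '''
    CREATE VIEW PersonFull AS
    SELECT p.PersonId, p.Name, d.HairColour
    FROM Person p
    JOIN PersonDetail d ON d.PersonId = p.PersonId
'''

// INSTEAD OF INSERT splits each incoming row across the two tables,
// carrying the identity from the anchor insert into the detail row.
sql.execute '''
    CREATE TRIGGER tr_PersonFull_Insert ON PersonFull
    INSTEAD OF INSERT AS
    BEGIN
        DECLARE @NewId INT;
        INSERT INTO Person (Name) SELECT Name FROM inserted;
        SET @NewId = SCOPE_IDENTITY();   -- single-row assumption
        INSERT INTO PersonDetail (PersonId, HairColour)
        SELECT @NewId, HairColour FROM inserted;
    END
'''

The adapter's InsertCommand can then be a plain INSERT into PersonFull, and the trigger does the fan-out across the underlying tables.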
Say I have an entity with an auto-generated primary key, and I save the entity with values for all the other fields, none of which need to be unique.
The entity gets auto-populated with the ID of the row that was inserted. How did it get hold of that primary key value?
EDIT:
If the primary key column is, say, an identity column whose value is decided entirely by the database, then Hibernate issues an INSERT statement without that column and the database picks the value. Does the database communicate its decision back? (I didn't think so.)
Hibernate uses one of three methods for extracting the DB auto-generated field, depending on what is supported by the JDBC driver or the dialect you are using.
Hibernate extracts the generated field value and puts it back in the POJO by:
Using the method Statement.getGeneratedKeys (Statement javadocs); see the sketch after this list
or
Inserting and selecting the generated field value directly from the insert statement. (Dialect Javadocs)
or
Executing a select statement after the insert to retrieve the generated IDENTITY value
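A plain-JDBC sketch of that first mechanism in Groovy (table, column, and connection details are invented), showing the call Hibernate makes when the driver supports it:

import java.sql.DriverManager
import java.sql.Statement

def conn = DriverManager.getConnection(
        'jdbc:postgresql://localhost:5432/mydb', 'user', 'password')

def ps = conn.prepareStatement(
        'INSERT INTO person (name) VALUES (?)',
        Statement.RETURN_GENERATED_KEYS)
ps.setString(1, 'Alice')
ps.executeUpdate()

// The driver reports the auto-generated key(s) back to the client;
// Hibernate reads them at this point and sets them on your entity.
def keys = ps.generatedKeys
if (keys.next()) {
    println "database assigned id ${keys.getLong(1)}"
}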
All of this is done internally by Hibernate.
Hope it's the explanation you are looking for.
This section of the Hibernate documentation describes the auto generation of ids. Usually the AUTO generation strategy is used for maximum portability and assuming that you use Annotations to provide your domain metadata you can configure it as follows:
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private long id;
Anyway the supplied link should provide all the detail you need on generated ids.
When you create an object with, say, a sequence-derived surrogate primary key, you pass it to the Hibernate session with that field set to the value Hibernate interprets as "not assigned", by default 0. The field is not populated with the assigned value until the corresponding record is inserted into the database table. You can trigger the insertion either by explicitly calling flush() on the Hibernate session or by performing a database read in the same session. After that, the field will hold the assigned value rather than 0.
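A sketch of that sequence of events in code (assuming a Person entity with a sequence-generated long id and an already-configured sessionFactory, both hypothetical):

def person = new Person(name: 'Alice')
assert person.id == 0L            // the "not assigned" sentinel

def session = sessionFactory.openSession()
def tx = session.beginTransaction()

session.save(person)              // schedules (or performs) the INSERT
session.flush()                   // forces any pending SQL to the database

assert person.id != 0L            // now holds the database-assigned value
tx.commit()
session.close()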
Say I'm mapping a simple object to a table that contains duplicate records and I want to allow duplicates in my code. I don't need to update/insert/delete on this table, only display the records.
Is there a way that I can put a fake (generated) ID column in my mapping file to trick NHibernate into thinking the rows are unique? Creating a composite key won't work because there could be duplicates across all of the columns.
If this isn't possible, what is the best way to get around this issue?
Thanks!
Edit: Query seemed to be the way to go
The NHibernate mapping makes the assumption that you're going to want to save changes, hence the requirement for an ID of some kind.
If you're allowed to modify the table, you could add an identity column (SQL Server naming - your database may differ) to autogenerate unique Ids - existing code should be unaffected.
If you're allowed to add to the database, but not to the table, you could try defining a view that includes a synthetic (calculated) RowNumber column, and using that as the data source to load from. Depending on your database vendor (and the product's handling of views and indexes) this may face some performance issues.
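A sketch of such a view in SQL Server syntax (object names invented), created here through groovy.sql.Sql:

import groovy.sql.Sql

def sql = Sql.newInstance('jdbc:sqlserver://localhost;databaseName=mydb',
        'user', 'password')   // placeholder connection details

// ROW_NUMBER() synthesizes a unique value over otherwise-duplicate
// rows; NHibernate can then map RowNumber as the identifier.
sql.execute '''
    CREATE VIEW PersonNumbered AS
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RowNumber, p.*
    FROM Person p
'''

Keep in mind the numbering is not stable across queries, which is fine for read-only display but nothing else.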
The other alternative, which I've not tried, would be to map your class to a SQL query instead of a table. IIRC, NHibernate supports having named SQL queries in the mapping file, and you can use those as the "data source" instead of a table or view.
If your data is read-only, one simple way we found was to wrap the query in a view, build the entity off the view, and add a NEWID() column (SQL Server's function for generating GUIDs); the result is something like
SELECT NEWID() AS ID, * FROM TABLE
ID then becomes your unique primary key. As stated above, this is only useful for read-only views, as the ID has no relevance after the query.