I have two fact tables: Budget and vente.
I created a link table to store them, but I get loops between the tables, as shown in the following picture.
How can I remove the loops?
I hope I can explain this adequately.
I have a series of Google Sheets with data from an Airtable database. Several of the fields are stringified arrays of recordIds pointing to another table.
These fields can contain anywhere from 0 to n comma-separated values.
I run a create/overwrite table SELECT statement to create native BigQuery tables for reporting. This works great.
Now I need to add the recordIds to a Repeated field.
I've manually written to a repeated field using:
INSERT INTO `robotic-vista-339622.Insurly_dataset.zzPOLICYTEST` (policyID, locations, carrier)
VALUES ('12334556',[STRUCT('recordId1'),STRUCT('recordId2')], 'name of policy');
However, I need to know how to do this using a SELECT statement rather than an INSERT. I also need to know how to handle it when the number of recordIds retrieved from Airtable is unknown: one record could have none and another could have 10 or more.
Any given sheet will look like the following, where "locations" contains the recordIds I want to add to a repeated field.
SHEETNAME: POLICIES
|policyId |carrier | locations |
|-----------|-----------|---------------------------------|
|recrTkk |Workman's | |
|rec45Yui |Workman's |recL45x32,recQz70,recPrjE3x |
|recQb17y |ABC Co. |rec5yUlt,recIrW34 |
In the above, the first row/record has no location IDs, while the subsequent rows/records have three and two respectively.
Any help is appreciated.
Thanks.
I'm unsure if answering my own question is the correct way to show that it was solved... but here is what it took.
I created a native table in BigQuery; the field for locations is a STRING with mode REPEATED.
Then I just run an overwrite table SELECT statement.
SELECT recordId, Name, Amount, SPLIT(locations) AS locations FROM `projectid.datasetid.googlesheetsdatatable`;
Tested, and I can run linked queries on the locations with UNNEST.
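For example, a minimal sketch of such a linked query, assuming the destination native table is called `projectid.datasetid.policies_native` (a placeholder, not my actual table name) and `locations` is the repeated STRING field created above:
SELECT recordId, Name, location
FROM `projectid.datasetid.policies_native`,  -- placeholder destination table
UNNEST(locations) AS location
WHERE location = 'recL45x32';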
I created 2 summary tables from the same source data for different date ranges.
Now that I have these multiple summary tables, I want to put those tables together
so that I will be able to run a summary on the combined table.
It's creating the summary table that is presenting the problem.
scratch.table_1 has 809,598 records.
scratch.table_2 has 1,228,176 records.
They both have the same set of fields from the source table,
plus a "record_number" field I created on each table using count(1).
The code I used to put these two tables together was:
CREATE TABLE scratch.table_1_and_2 AS
SELECT * FROM scratch.table_1
UNION ALL
SELECT * FROM scratch.table_2
I assumed that there would be 809,598 + 1,228,176 records in the new table (2,037,774 records).
But there are only 1,960,769 records in the new table.
What am I doing wrong?
One way to troubleshoot would be to identify some of the missing records and see what might be different about the data in those rows that would cause them to be left out. A UNION ALL should include duplicate records, so duplicates shouldn't be the issue. Maybe there is some data issue that's causing those records to be dropped. Also, I'm assuming there isn't any funny business with views going on in the underlying tables, and that no data loads are affecting your record counts.
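For instance, something along these lines could surface rows from table_1 that didn't land in the combined table. This is only a sketch: it assumes record_number uniquely identifies a row within table_1; if record_number values can collide with table_2's, join on additional columns instead.
-- Rows present in table_1 but missing from the combined table
SELECT t1.*
FROM scratch.table_1 AS t1
LEFT JOIN scratch.table_1_and_2 AS c
       ON c.record_number = t1.record_number
WHERE c.record_number IS NULL;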
I have two schemas and a table in each schema. The table is called "concept"; one schema is the "cdm" schema and the other is the "vocab" schema.
I would like to copy records from the "concept" table of the "vocab" schema to the "concept" table of the "cdm" schema based on certain conditions.
The concept table has a primary key called concept_id and several other non-primary columns.
Ex: vocab.concept has 200 records. I would like to copy only the records that satisfy a condition such as concept_id > 150.
Currently the concept table in the CDM schema (cdm.concept) is filled with 150 records, so I would like to add/insert the remaining 50 records (those with concept_id > 150).
Can you let me know how I can do this copy, ideally ignoring duplicates and copying only the new records (i.e. the newly added 50 records)?
INSERT INTO cdm.concept (concept_id, Name, Age, Info)
SELECT concept_id, Name, Age, Info
FROM vocab.concept
WHERE concept_id > 150;
Questions
Is the above SQL correct?
Is it the only way? My table might have more than 25 columns, and I don't wish to type in the column names every time I want to copy.
Please note that the table already exists in the cdm schema, so I don't wish to use a "Create Table" approach.
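For illustration, a sketch of the kind of statement I'm hoping for. It assumes both tables have identical column lists so that SELECT * lines up positionally with the target, and it uses NOT EXISTS to skip rows already present; this is untested against my actual schemas.
-- Sketch only: relies on cdm.concept and vocab.concept having identical column order
INSERT INTO cdm.concept
SELECT v.*
FROM vocab.concept v
WHERE NOT EXISTS (
    SELECT 1 FROM cdm.concept c WHERE c.concept_id = v.concept_id
);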
I have the following problem.
I have a table Entries that contains 2 columns:
EntryID - unique identifier
Name - some name
I have another EntriesMapping table (a many-to-many mapping table) that contains 2 columns:
EntryID that refers to the EntryID of the Entries table
PartID that refers to a PartID in a separate Parts table.
I need to write an SP that will return all data from the Entries table, but for each row in the Entries table I want to provide a list of all PartIDs that are registered in the EntriesMapping table.
My question is how best to approach the design of the solution, given that the results of the SP will regularly be processed by an app, so performance is quite important.
1. Do I write an SP that selects multiple rows per entry, so that if more than one PartID is registered for a given entry, I return multiple rows with the same EntryID and Name but different PartIDs?
OR
2. Do I write an SP that selects 1 row per entry in the Entries table, with a field (string/xml/json) that contains all the different PartIDs?
OR
3. Is there some other solution that I am not thinking of?
Solution 1 seems to me to be the better way to go, but I will be passing lots of repeating data.
Solution 2 won't pass extra data, but the string/json/xml would need additional processing, resulting in more CPU time per item.
PS: I feel like this is quite a common problem to solve, but I was unable to find any resource that provides common solutions or pros/cons of the different approaches.
I think you need a simple JOIN:
SELECT e.EntryId, e.Name, em.PartId
FROM Entries e
JOIN EntriesMapping em ON e.EntryId = em.EntryId
This will return what you want; there is no need for a stored procedure for that.
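If you later decide you do want one row per entry (option 2 in the question), an aggregate can collapse the PartIDs. A rough sketch, assuming a dialect that supports STRING_AGG (SQL Server 2017+ or PostgreSQL):
-- One row per entry, with PartIDs collapsed into a comma-separated string
SELECT e.EntryId,
       e.Name,
       STRING_AGG(CAST(em.PartId AS varchar(20)), ',') AS PartIds
FROM Entries e
LEFT JOIN EntriesMapping em ON e.EntryId = em.EntryId
GROUP BY e.EntryId, e.Name;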
Trying to create relationships (joins) between tables in Power Pivot.
I have 2 tables I would like to join together, connected by a common column, CustomerID.
One is a Fact Table, the other a Dim table (lookup).
I have run "remove duplicates" on both tables without any problem.
But I still get an error saying: "The relationship cannot be created because each column contains duplicate values. Select at least one column that contains only unique values."
The Fact Table contains duplicates (as it should?) and the Dim Table does not, so why do I get this error?
Help much appreciated
I created an appended table with both "CustomerID" columns. After the columns were appended together, I could "remove duplicates" and connect the tables through the newly created appended table.
I don't know if this causes another problem later, however.
You can also check for duplicate ID values in a column by using the Group By feature.
Remove all columns except the ID, then add a column that consists only of the number 1.
Group by ID, summing the contents of the added column, and filter out the IDs whose total equals 1. What's left are the duplicated IDs.
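For comparison, the same duplicate check expressed as a query; the table and column names here are placeholders, and in Power Query itself this is done through the Group By dialog rather than SQL.
-- CustomerID values that appear more than once in the lookup (Dim) table
SELECT CustomerID, COUNT(*) AS occurrences
FROM DimCustomer        -- placeholder name for the Dim table
GROUP BY CustomerID
HAVING COUNT(*) > 1;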