I would like to simplify my data with a view, MainView, but am having a hard time figuring it out.
I have a Fact table that is specific to clients, language, and status. The ID in the Fact table comes from a FactLink table that just has a FactLinkID column. The Status table has an Order column that needs to be shown in the aggregate view instead of the StatusID. My Main table references the Fact table in multiple columns.
The end goal is to query the view by the compound index of LanguageID, StatusOrder, ClientID more simply than I could before, grabbing the largest specified StatusOrder and the specified ClientID or ClientID 1. That is what I was hoping to simplify with the view.
So, the tables and the desired MainView look like this:
Main
ID | DescriptionID | DisclaimerID | Other
----+---------------+--------------+-------------
50 | 1 | 2 | Blah
55 | 4 | 3 | Blah Blah
Fact
FactID | LanguageID | StatusID | ClientID | Description
-------+------------+----------+----------+------------
1 | 1 | 1 | 1 | Some text
1 | 2 | 1 | 1 | Otro texto
1 | 1 | 3 | 2 | Modified text
2 | 1 | 1 | 1 | Disclaimer1
3 | 1 | 1 | 1 | Disclaimer2
4 | 1 | 1 | 1 | Some text 2
FactLink
ID
--
1
2
3
4
Status
ID | Order
---+------
1 | 10
2 | 100
3 | 20
MainView
MainID | StatusOrder | LanguageID | ClientID | Description | Disclaimer | Other
-------+-------------+------------+----------+---------------+-------------+------
50 | 10 | 1 | 1 | Some text | Disclaimer1 | Blah
50 | 10 | 2 | 1 | Otro texto | NULL | Blah
50 | 20 | 1 | 2 | Modified text | NULL | Blah
55 | 10 | 1 | 1 | Some text 2 | Disclaimer2 | Blah Blah
Here's how I implemented it with just a single column that references the Fact table:
DROP VIEW IF EXISTS dbo.KeywordView
GO
CREATE VIEW dbo.KeywordView
WITH SCHEMABINDING
AS
SELECT t.KeywordID, f.ClientID, f.Description Keyword, f.LanguageID, s.[Order] StatusOrder
FROM dbo.Keyword t
JOIN dbo.Fact f
ON f.FactLinkID = t.KeywordID
JOIN dbo.Status s
ON f.StatusID = s.StatusID
GO
CREATE UNIQUE CLUSTERED INDEX KeywordIndex
ON dbo.KeywordView (KeywordID, ClientID, LanguageID, StatusOrder)
My previous query queried for everything except for that StatusOrder. But adding in the StatusOrder seems to complicate things. Here's my previous query without the StatusOrder. When I created a view on a table with just a single Fact linked column it greatly simplified things, but extending that to two or more columns has proven difficult!
SELECT
Main.ID,
COALESCE(fDescription.Description, dfDescription.Description) Description,
COALESCE(fDisclaimer.Description, dfDisclaimer.Description) Disclaimer,
Main.Other
FROM Main
LEFT OUTER JOIN Fact fDescription
ON fDescription.FactLinkID = Main.DescriptionID
AND fDescription.ClientID = @clientID
AND fDescription.LanguageID = @langID
AND fDescription.StatusID = @statusID -- This actually needs to get the largest `StatusOrder`, not the `StatusID`.
LEFT OUTER JOIN Fact dfDescription
ON dfDescription.FactLinkID = Main.DescriptionID
AND dfDescription.ClientID = 1
AND dfDescription.LanguageID = @langID
AND dfDescription.StatusID = @statusID
... -- Same for Disclaimer
WHERE Main.ID = 50
Not sure if this is the most performant or elegant way to solve this problem, but I finally thought of a way to do it. The problem with the solution below is that it can no longer be indexed, so now I need to figure out how to do that without having to wrap it in a derived table.
SELECT
x.ID,
x.StatusOrder,
x.LanguageID,
x.ClientID,
x.Other,
MAX(x.Description),
MAX(x.Disclaimer)
FROM (
SELECT
Main.ID,
s.StatusOrder,
f.LanguageID,
f.ClientID,
f.Description,
NULL Disclaimer,
Main.Other
FROM Main
JOIN Fact f
ON f.FactID = Main.DescriptionID
JOIN Status s ON s.StatusID = f.StatusID
UNION ALL
SELECT
Main.ID,
s.StatusOrder,
f.LanguageID,
f.ClientID,
NULL Description,
f.Description Disclaimer,
Main.Other
FROM Main
JOIN Fact f
ON f.FactID = Main.DisclaimerID
JOIN Status s ON s.StatusID = f.StatusID
) x
GROUP BY x.ID, x.StatusOrder, x.LanguageID, x.ClientID, x.Other
I have a Production Table and a Standing Data table. The relationship of Production to Standing Data is actually Many-To-Many which is different to how this relationship is usually represented (Many-to-One).
The standing data table holds a list of tasks and the score each task is worth. Tasks can appear multiple times with different "ValidFrom" dates for changing the score at different points in time. What I am trying to do is query the Production Table so that the TaskID is looked up in the table and uses the date it was logged to check what score it should return.
Here's an example of how I want the data to look:
Production Table:
+----------+------------+-------+-----------+--------+-------+
| RecordID | Date | EmpID | Reference | TaskID | Score |
+----------+------------+-------+-----------+--------+-------+
| 1 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 2 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 3 | 30/02/2020 | 1 | 123 | 1 | 2 |
| 4 | 31/02/2020 | 1 | 123 | 1 | 2 |
+----------+------------+-------+-----------+--------+-------+
Standing Data
+----------+--------+----------------+-------+
| RecordID | TaskID | DateActiveFrom | Score |
+----------+--------+----------------+-------+
| 1 | 1 | 01/02/2020 | 1.5 |
| 2 | 1 | 28/02/2020 | 2 |
+----------+--------+----------------+-------+
I have tried the below code but unfortunately due to multiple records meeting the criteria, the production data duplicates with two different scores per record:
SELECT p.[RecordID],
p.[Date],
p.[EmpID],
p.[Reference],
p.[TaskID],
s.[Score]
FROM ProductionTable as p
LEFT JOIN StandingDataTable as s
ON s.[TaskID] = p.[TaskID]
AND s.[DateActiveFrom] <= p.[Date];
What is the correct way to return the correct and singular/scalar Score value for this record based on the date?
You can use APPLY:
SELECT p.[RecordID], p.[Date], p.[EmpID], p.[Reference], p.[TaskID], s.[Score]
FROM ProductionTable as p OUTER APPLY
( SELECT TOP (1) s.[Score]
FROM StandingDataTable AS s
WHERE s.[TaskID] = p.[TaskID] AND
s.[DateActiveFrom] <= p.[Date]
ORDER BY S.DateActiveFrom DESC
) s;
If you want the score matched at the record level instead, change the WHERE clause inside the APPLY.
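For anyone who wants to try the "latest row on or before the date" lookup without a SQL Server instance, here is a minimal sketch in Python with the built-in sqlite3 module. SQLite has no APPLY, so the sketch swaps in a correlated subquery with ORDER BY ... LIMIT 1, which plays the same role as TOP (1) inside the OUTER APPLY. The dates are rewritten as ISO strings so that they compare correctly as text, the last two production dates are made-up valid dates, and EmpID and Reference are left out for brevity (all of these are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductionTable (RecordID INTEGER, Date TEXT, TaskID INTEGER);
    CREATE TABLE StandingDataTable (
        RecordID INTEGER, TaskID INTEGER, DateActiveFrom TEXT, Score REAL
    );
    -- ISO dates (YYYY-MM-DD) sort correctly as plain text.
    INSERT INTO ProductionTable VALUES
        (1, '2020-02-27', 1),
        (2, '2020-02-27', 1),
        (3, '2020-02-28', 1),
        (4, '2020-02-29', 1);
    INSERT INTO StandingDataTable VALUES
        (1, 1, '2020-02-01', 1.5),
        (2, 1, '2020-02-28', 2.0);
""")

# For each production row, pick the single score whose DateActiveFrom is the
# latest one that is still on or before the production date.
scored = conn.execute("""
    SELECT p.RecordID,
           p.Date,
           (SELECT s.Score
              FROM StandingDataTable AS s
             WHERE s.TaskID = p.TaskID
               AND s.DateActiveFrom <= p.Date
             ORDER BY s.DateActiveFrom DESC
             LIMIT 1) AS Score
      FROM ProductionTable AS p
     ORDER BY p.RecordID
""").fetchall()

for row in scored:
    print(row)
```

Each production row comes back exactly once, because the subquery can never return more than one score.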
I'm using Firebase data exported to BigQuery (data contains events data coming from mobile application). I've made an update to the application and new parameter is being reported. Unfortunately, not all users have the latest version of app. This is why I have rows with that parameter as well as rows without it.
In event_params I have something like:
| No | contentId | contentName |
|----|-----------|---------------------|
| 1 | abc | (parameter missing) |
| 2 | abc | Name of ABC |
| 3 | cde | Name of CDE |
| 4 | efg | Name of EFG |
| 5 | abc | (parameter missing) |
| 6 | cde | Name of CDE |
Now, when I query that table and I specify (using UNNEST) that I need contentName parameter, I don't get rows where that parameter is missing.
I have query:
SELECT
ep.value.string_value as ContentID,
ep2.value.string_value as ContentName,
COUNT(1) as `Count`
FROM
`mydataset.mytable.events_*`,
UNNEST(event_params) as ep,
UNNEST(event_params) as ep2
WHERE
event_name="my_event_name" AND
ep.key="contentID" AND
ep2.key="contentName"
GROUP BY 1,2
and I get:
| No | contentId | contentName | Count |
|----|-----------|-------------|-------|
| 1 | abc | Name of ABC | 1 |
| 2 | cde | Name of CDE | 2 |
| 3 | efg | Name of EFG | 1 |
However, I would like to get:
| No | contentId | contentName | Count |
|----|-----------|-------------|-------|
| 1 | abc | Name of ABC | 3 |
| 2 | cde | Name of CDE | 2 |
| 3 | efg | Name of EFG | 1 |
I want to complete somehow rows with missing contentName parameters using values from other rows with the same contentId (we can assume that each contentId has the same, constant contentName)
How can I achieve this? I thought about a self join, but that is not really recommended in BigQuery.
The solution provided by Gordon can be slightly modified in order to achieve what you intend:
SELECT contentId.value.string_value as ContentID,
MAX(contentName.value.string_value) as ContentName,
COUNT(1) as `Count`
FROM `mydataset.mytable.events_*` e LEFT JOIN
UNNEST(e.event_params) as contentId
ON contentId.key = 'contentID' LEFT JOIN
UNNEST(e.event_params) contentName
ON contentName.key = 'contentName'
WHERE e.event_name = 'my_event_name'
GROUP BY 1;
Note that I am grouping only by the ContentID and I am aggregating the ContentNames using MAX, which ignores null values.
I have recreated your example table and it works as expected.
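To see why MAX fills the gap, here is a small sketch using Python's sqlite3 on a flattened stand-in for the event data. The table name events and its two plain columns are assumptions for illustration; the real Firebase export keeps these values inside the repeated event_params field, so this only demonstrates the grouping idea, not the UNNEST part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (contentId TEXT, contentName TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("abc", None),            # older app version: parameter missing
        ("abc", "Name of ABC"),
        ("cde", "Name of CDE"),
        ("efg", "Name of EFG"),
        ("abc", None),            # older app version: parameter missing
        ("cde", "Name of CDE"),
    ],
)

# Group by contentId only. MAX skips NULLs, so any row that does carry the
# name supplies it for the whole group, while COUNT(*) still counts every row.
summary = conn.execute("""
    SELECT contentId, MAX(contentName) AS contentName, COUNT(*) AS cnt
      FROM events
     GROUP BY contentId
     ORDER BY contentId
""").fetchall()

for row in summary:
    print(row)
```

This relies on the assumption stated in the question that each contentId always has the same contentName; if names could differ, MAX would silently pick one of them.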
You can update the table so that you fill the nulls and then make your query
[1]
UPDATE `your_project.your_dataset.your_table` t_incomplete
SET t_incomplete.contentName = t_complete.contentName
FROM `your_project.your_dataset.your_table` t_complete
WHERE t_incomplete.contentId = t_complete.contentId
AND t_complete.contentName IS NOT NULL
I am not sure how this will work with nested tables, but you can always:
UPDATE UNNESTING
UPDATE WITH QUERY [1]
UPDATE NESTING
You can picture the idea behind it with this sample table:
CREATE TABLE `your_project.your_dataset.sample_table`
(
id INT64,
nullable STRING
);
INSERT INTO `your_project.your_dataset.sample_table`
VALUES (1, 'foo');
INSERT INTO `your_project.your_dataset.sample_table`
VALUES (1, null);
INSERT INTO `your_project.your_dataset.sample_table`
VALUES (2, 'lel');
INSERT INTO `your_project.your_dataset.sample_table`
VALUES (1, null);
INSERT INTO `your_project.your_dataset.sample_table`
VALUES (2, null);
and QUERY[2]
UPDATE `your_project.your_dataset.sample_table` t_incomplete
SET t_incomplete.nullable = t_complete.nullable
FROM `your_project.your_dataset.sample_table` t_complete
WHERE t_incomplete.id = t_complete.id
AND t_complete.nullable IS NOT NULL
This way you actually give the corresponding value to the cell and you can run your query without worries. I hope this works!
Do you just need an OR condition?
WHERE event_name = 'my_event_name' AND
ep.key = 'contentID' AND
(ep2.key = 'contentName' OR ep2.key IS NULL)
EDIT:
I think you need LEFT JOINs:
SELECT contentId.value.string_value as ContentID,
contentName.value.string_value as ContentName,
COUNT(1) as `Count`
FROM `mydataset.mytable.events_*` e LEFT JOIN
UNNEST(e.event_params) as contentId
ON contentId.key = 'contentID' LEFT JOIN
UNNEST(e.event_params) contentName
ON contentName.key = 'contentName'
WHERE e.event_name = 'my_event_name'
GROUP BY 1, 2;
Note: This should preserve the counts you want but might result in extra rows in the result set.
I want to get data in a single row from two tables which have a one-to-many relation.
Primary table
Secondary table
I know that for each record of the primary table, the secondary table can have a maximum of 10 rows. Here is the structure of the tables:
Primary Table
-------------------------------------------------
| ImportRecordId | Summary |
--------------------------------------------------
| 1 | Imported Successfully |
| 2 | Failed |
| 3 | Imported Successfully |
-------------------------------------------------
Secondary table
------------------------------------------------------
| ImportRecordId | CodeName | CodeValue |
-------------------------------------------------------
| 1 | ABC | 123456A |
| 1 | DEF | 8766339 |
| 1 | GHI | 887790H |
------------------------------------------------------
I want to write a query with an inner join that gets data from both tables in such a way that each row from the secondary table is treated as columns instead of being shown as multiple rows.
I can hard-code 20 column names (a maximum of 10 records can exist in the secondary table, and I want to display the values of two columns from each in a single row), so if there are fewer than 10 records in the secondary table, all the other columns will be shown as NULL.
Here is the expected output. For the first record in the primary table there were only three rows, so the two required columns from those three rows are converted into columns, and all the other column values are NULL.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ImportRecordId | Summary | CodeName1 | CodeValue1 | CodeName2 | CodeValue2 | CodeName3 | CodeValue3 | CodeName4 | CodeValue4| CodeName5 | CodeValue5| CodeName6 | CodeValue6| CodeName7 | CodeValue7 | CodeName8 | CodeValue8 | CodeName9 | CodeValue9 | CodeName10 | CodeValue10|
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | Imported Successfully | ABC | 123456A | DEF | 8766339 | GHI | 887790H | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Here is my simple SQL query, which returns all the data from both tables. Instead of multiple rows from the secondary table, I want to get them in a single row like the result set above.
Select p.ImportRecordId,p.Summary,s.*
from [dbo].[primary_table] p
inner join [dbo].[secondary_table] s on p.ImportRecordId = s.ImportRecordId
The following uses Row_Number(), a JOIN and a CROSS APPLY to create the source of the PIVOT.
You'll have to add CodeName/CodeValue 4 through 10 yourself.
Example
Select *
From (
Select A.[ImportRecordId]
,B.Summary
,C.*
From (
Select *
,RN = Row_Number() over (Partition by [ImportRecordId] Order by [CodeName])
From [dbo].[secondary_table] A
) A
Join [dbo].[primary_table] B on A.[ImportRecordId]=B.[ImportRecordId]
Cross Apply (values (concat('CodeName' ,RN),CodeName)
,(concat('CodeValue',RN),CodeValue)
) C(Item,Value)
) src
Pivot (max(value) for Item in (CodeName1,CodeValue1,CodeName2,CodeValue2,CodeName3,CodeValue3) ) pvt
Returns
| ImportRecordId | Summary               | CodeName1 | CodeValue1 | CodeName2 | CodeValue2 | CodeName3 | CodeValue3 |
| 1              | Imported Successfully | ABC       | 123456A    | DEF       | 8766339    | GHI       | 887790H    |
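If you want to experiment with the same row-numbering idea outside SQL Server, here is a rough sketch in Python with sqlite3. SQLite has no PIVOT or CROSS APPLY, so this swaps in conditional aggregation (MAX over a CASE per slot) on top of ROW_NUMBER(), which needs SQLite 3.25 or newer. The SLOTS constant and the generated column list are my own illustration, not part of the query above; raising SLOTS to 10 would give the full 20-column layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE primary_table (ImportRecordId INTEGER, Summary TEXT);
    CREATE TABLE secondary_table (
        ImportRecordId INTEGER, CodeName TEXT, CodeValue TEXT
    );
    INSERT INTO primary_table VALUES
        (1, 'Imported Successfully'), (2, 'Failed'), (3, 'Imported Successfully');
    INSERT INTO secondary_table VALUES
        (1, 'ABC', '123456A'), (1, 'DEF', '8766339'), (1, 'GHI', '887790H');
""")

SLOTS = 3  # hypothetical setting; use 10 for the layout in the question

# One CodeName/CodeValue column pair per slot. A slot with no matching row
# number yields NULL, because MAX over an empty set of values is NULL.
slot_columns = ",\n       ".join(
    f"MAX(CASE WHEN n.rn = {i} THEN n.CodeName END) AS CodeName{i}, "
    f"MAX(CASE WHEN n.rn = {i} THEN n.CodeValue END) AS CodeValue{i}"
    for i in range(1, SLOTS + 1)
)

query = f"""
WITH numbered AS (
    SELECT ImportRecordId, CodeName, CodeValue,
           ROW_NUMBER() OVER (
               PARTITION BY ImportRecordId ORDER BY CodeName
           ) AS rn
      FROM secondary_table
)
SELECT p.ImportRecordId, p.Summary,
       {slot_columns}
  FROM primary_table AS p
  JOIN numbered AS n ON n.ImportRecordId = p.ImportRecordId
 GROUP BY p.ImportRecordId, p.Summary
"""

pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

As with the inner join in the query above, primary records without any secondary rows do not appear; switching to a LEFT JOIN would keep them with all code columns NULL.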
I have three tables: BookingNode, Booking, and AirTrip.
AirTrip :
+----+------------+
| ID | Name |
+----+------------+
| 0 | One way |
| 1 | Round trip |
| 2 | Circle |
| 3 | Other |
+----+------------+
Whenever we make a booking, we store the data as:
BookingNode table
+--------+-------------------+------------+----------------------+
| ID | CustomerGivenName | IPAddress | Email |
+--------+-------------------+------------+----------------------+
| 177022 | xfghfh | 2130706473 | mikehussey#gmail.com |
| 177021 | cfggjfj | 2130706473 | mikehussey#gmail.com |
+--------+-------------------+------------+----------------------+
Booking Table :
+--------+---------------+-----------+------------+------------+
| ID | BookingNodeID | AirTripID | AirLineId | Provider |
+--------+---------------+-----------+------------+------------+
| 181251 | 177020 | 1 | 978 | Jet |
| 181252 | 177021 | 0 | 982 | Go |
| 181253 | 177021 | 0 | 978 | Jet |
+--------+---------------+-----------+------------+------------+
If a round-trip flight is booked and the provider is the same for both legs, a single entry is made in the Booking table with an AirTripID of 1 (Booking ID 181251, provider Jet).
But if the providers differ between the two legs, two entries are made in the Booking table, and the AirTripID of both entries is one way (0 in the sample: Booking IDs 181252 and 181253, providers Go and Jet). In this case the BookingNodeID is the same for both.
Problem: I have to write a query to get the different types of bookings (one way, round trip, circle), but when I join on AirTripID it gives me incorrect results.
How can I write my query to give correct results, knowing that the BookingNodeID is going to be the same for both round-trip entries in the Booking table?
Sample Output
+-------------+---------------+-------------------+------------+
| AirTripName | BookingNodeID | CustomerGivenName | IPAddress |
+-------------+---------------+-------------------+------------+
| TwoWay | 177020 | xfghfh | 2130706473 |
| TwoWay | 177021 | cfggjfj | 2130706473 |
+-------------+---------------+-------------------+------------+
This code may contain errors, since I did not enter the sample data to test it, but the logic is as follows: if b.AirTripID is 0, count the Booking rows for that BookingNodeID. If there is more than one row, the booking is actually two-way, so AirTripType becomes 1; otherwise it stays the same as b.AirTripID. Copy the query below and fix any errors you find; I believe the logic should work for your expected result.
select
    bd.ID,
    bd.CustomerGivenName,
    case
        -- two or more one-way rows on the same node: treat as round trip
        when b.AirTripID = 0
             and (select count(*)
                  from Booking
                  where Booking.BookingNodeID = bd.ID) > 1
        then 1
        else b.AirTripID
    end as AirTripType,
    bd.IPAddress
from BookingNode bd
inner join (select BookingNodeID, AirTripID from Booking group by BookingNodeID, AirTripID) as b on b.BookingNodeID = bd.ID
where bd.ID = 177021
Try This
WITH CTE
AS
(
SELECT
SeqNo = ROW_NUMBER() OVER(PARTITION BY BN.ID ORDER BY B.ID),
B.BookingNodeID,
BN.CustomerGivenName,
BN.IPAddress,
AirTripId = A.ID,
AirTripNm = A.Name
FROM Booking B
INNER JOIN AirTrip A
ON A.ID = B.AirTripID
LEFT JOIN BookingNode BN
ON B.BookingNodeID = BN.id
)
SELECT
C1.SeqNo,
AirTripName = CASE WHEN C2.SeqNo IS NOT NULL
THEN 'Round trip'
ELSE C1.AirTripNm END,
C1.BookingNodeID,
C1.CustomerGivenName,
C1.IPAddress
FROM CTE C1
LEFT JOIN CTE C2
ON C1.BookingNodeID = C2.BookingNodeID
AND C2.SeqNo = 2
WHERE c1.SeqNo = 1
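The same idea (more than one one-way row under a single BookingNodeID means a round trip) can also be tried with a plain GROUP BY. Below is a minimal sketch in Python with sqlite3, not the CTE query above. The sample rows are partly assumptions: I gave node 177020 the customer name shown in the expected output, and node 177023 with its booking 181254 is a hypothetical single one-way booking added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AirTrip (ID INTEGER, Name TEXT);
    CREATE TABLE BookingNode (ID INTEGER, CustomerGivenName TEXT);
    CREATE TABLE Booking (ID INTEGER, BookingNodeID INTEGER, AirTripID INTEGER);
    INSERT INTO AirTrip VALUES
        (0, 'One way'), (1, 'Round trip'), (2, 'Circle'), (3, 'Other');
    INSERT INTO BookingNode VALUES
        (177020, 'xfghfh'),   -- assumed name, taken from the expected output
        (177021, 'cfggjfj'),
        (177023, 'demo');     -- hypothetical node, not in the question
    INSERT INTO Booking VALUES
        (181251, 177020, 1),
        (181252, 177021, 0),
        (181253, 177021, 0),
        (181254, 177023, 0);  -- hypothetical single one-way booking
""")

# One output row per booking node. Several one-way rows on the same node are
# reported as a round trip; anything else keeps its own AirTrip name.
trips = conn.execute("""
    SELECT bn.ID,
           CASE WHEN COUNT(*) > 1 AND MAX(b.AirTripID) = 0
                THEN 'Round trip'
                ELSE MIN(a.Name)
           END AS AirTripName,
           bn.CustomerGivenName
      FROM Booking AS b
      JOIN AirTrip AS a ON a.ID = b.AirTripID
      JOIN BookingNode AS bn ON bn.ID = b.BookingNodeID
     GROUP BY bn.ID, bn.CustomerGivenName
     ORDER BY bn.ID
""").fetchall()

for row in trips:
    print(row)
```

MIN(a.Name) is only there to satisfy the GROUP BY; it assumes that a node with a single Booking row has a single trip type.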
Select distinct bk.bookingnodeid,cst.customername,ipaddress,
case when count(airtripid)over(partition by bookingnodeid order by bookingnodeid)=2 then 'RoundTrip' else name end As AirTripName
from booking bk
inner join airlinetrip at
on bk.airtripid=at.id
inner join customer cst
on cst.id=bk.bookingnodeid
http://sqlfiddle.com/#!4/24637/1
I have three tables (better details and data are shown in the sqlfiddle link): an old table, the new table replacing it, and a cross-reference table in between. One field in each table (version) goes through the cross reference, and another field is the same in the old and new tables (changeID).
I need a query that, when passed a list of new_version values, returns new_version + NEW_changeType along with the equivalent original_version + OLD_changeType (if there is an old version equivalent), plus any old changeIDs that were 'missed' in the conversion of data.
TABLES (fields on the same line are equivalent)
OLD_table | XREF_table | NEW_Table
original_version | original_version |
changeID | | changeID
OLD_changeType | |
| new_version | new_version
| | NEW_changeType
DATA
111,1,CT1 | 111,AAA | AAA,1,ONE
111,2,CT2 | 222,BBB | AAA,2,TWO
222,1,CT1 | 333,DDD | BBB,1,ONE
222,2,CT2 | | BBB,2,TWO
222,3,CT3 | | CCC,1,ONE
333,1,CT1 | |
444,1,CT1 | |
If passed the following list, the result set should look like this (order doesn't matter):
AAA,BBB,CCC
| NEW_VERSION | NEW_CHANGE_TYPE| ORIGINAL_VERSION | CHANGEID | OLD_CHANGE_TYPE |
|-------------|----------------|------------------|----------|-----------------|
| AAA | ONE | 111 | 1 | CT1 |
| AAA | TWO | 111 | 2 | CT2 |
| BBB | ONE | 222 | 1 | CT1 |
| BBB | TWO | 222 | 2 | CT2 |
| CCC | ONE | (null) | (null) | (null) |
| (null) | (null) | 222 | 3 | CT3 |
I'm having trouble getting ALL the data required. With the queries I've tried, I seem to either 1) miss a row or 2) get additional rows not matching the requirements.
The queries I've played with are as follows.
select
a.new_version,
a.Change_type,
c.original_version,
c.changeID,
c.OLD_Change_type
from NEW_TABLE a
LEFT OUTER JOIN XREF_TABLE b on a.new_version = b.new_version
FULL OUTER JOIN OLD_TABLE c on
b.original_version = c.original_version and a.changeID = c.changeID
where (b.new_version in ('AAA','BBB','CCC') or b.new_version is null);
select
a.new_version,
a.Change_type,
c.original_version,
c.changeID,
c.OLD_Change_type
from NEW_TABLE a
FULL JOIN XREF_TABLE b on a.new_version = b.new_version
FULL JOIN OLD_TABLE c on
b.original_version = c.original_version and a.changeID = c.changeID
where (a.new_version in ('AAA','BBB','CCC'));
The first returns one 'extra' row with the 333,DDD data, which is not specified in the input.
The second returns one row too few (the row with the changeID from the old table that was "missed" when this data was converted over).
Any thoughts or suggestions on how to solve this?
First inner join old_table and xref_table, as you are not interested in any old_table entries without an xref_table entry. Then full outer join new_table. In your WHERE clause be aware that new_table.new_version can be null, so use coalesce to use xref_table.new_version in this case to limit your results to AAA, BBB and CCC. That's all.
select
coalesce(n.new_version, x.new_version) as new_version,
n.change_type,
o.original_version,
o.changeid,
o.old_change_type
from old_table o
inner join xref_table x
on x.original_version = o.original_version
full outer join new_table n
on n.new_version = x.new_version
and n.changeid = o.changeid
where coalesce(n.new_version, x.new_version) in ('AAA','BBB','CCC')
order by 1,2,3,4,5
;
Here is your fiddle: http://sqlfiddle.com/#!4/24637/11.
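In case the fiddle is unavailable, here is a rough way to check the logic locally with Python's sqlite3. FULL OUTER JOIN only arrived in SQLite 3.39, so this sketch emulates it with a LEFT JOIN plus a UNION ALL of the unmatched new_table rows; that is a swapped-in technique, not the query above. The column names follow the answer's query, and the version numbers are stored as integers (an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (
        original_version INTEGER, changeid INTEGER, old_change_type TEXT
    );
    CREATE TABLE xref_table (original_version INTEGER, new_version TEXT);
    CREATE TABLE new_table (
        new_version TEXT, changeid INTEGER, change_type TEXT
    );
    INSERT INTO old_table VALUES
        (111, 1, 'CT1'), (111, 2, 'CT2'), (222, 1, 'CT1'), (222, 2, 'CT2'),
        (222, 3, 'CT3'), (333, 1, 'CT1'), (444, 1, 'CT1');
    INSERT INTO xref_table VALUES (111, 'AAA'), (222, 'BBB'), (333, 'DDD');
    INSERT INTO new_table VALUES
        ('AAA', 1, 'ONE'), ('AAA', 2, 'TWO'), ('BBB', 1, 'ONE'),
        ('BBB', 2, 'TWO'), ('CCC', 1, 'ONE');
""")

result = conn.execute("""
    WITH old_x AS (
        -- old rows that have a cross reference (444 drops out here)
        SELECT o.original_version, o.changeid, o.old_change_type, x.new_version
          FROM old_table AS o
          JOIN xref_table AS x ON x.original_version = o.original_version
    )
    -- Part 1: every cross-referenced old row, with its new row if one exists
    SELECT ox.new_version, n.change_type,
           ox.original_version, ox.changeid, ox.old_change_type
      FROM old_x AS ox
      LEFT JOIN new_table AS n
        ON n.new_version = ox.new_version AND n.changeid = ox.changeid
     WHERE ox.new_version IN ('AAA', 'BBB', 'CCC')
    UNION ALL
    -- Part 2: new rows that have no old counterpart at all
    SELECT n.new_version, n.change_type, NULL, NULL, NULL
      FROM new_table AS n
     WHERE n.new_version IN ('AAA', 'BBB', 'CCC')
       AND NOT EXISTS (
           SELECT 1 FROM old_x AS ox
            WHERE ox.new_version = n.new_version AND ox.changeid = n.changeid
       )
""").fetchall()

for row in result:
    print(row)
```

Note that the "missed" old row (222, 3, CT3) comes back with new_version BBB rather than NULL, the same as with the coalesce in the query above.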
BTW: It is better not to use arbitrary aliases like a, b and c that don't indicate which table is meant; they make the query harder to understand. Use the table's first letter(s) or an acronym instead.