Inserting data from a table in another project in BigQuery

I've created two tables in Google BigQuery: "conversion_log_" in project1 and "test_table" in project2.
I have been trying to select data from conversion_log_ and insert it into test_table. I want to transfer the orderid (STRING) of the conversion_log_ rows whose pgid and luid match the pgid and luid in test_table, but I get this error: "Unrecognized name: hitobito_test at [6:10]". I'm sure my table name is correct.
I can't work out the cause of this error. Could anyone tell me?
Sorry, I'm a beginner at BigQuery, so if I have overlooked something, please let me know.
insert into hitobito_test.test_table(orderid)
select orderid
from
`kuzen-198289.conversion_log.conversion_log_` as p
where
p.pgid = hitobito_test.test_table.pgid
AND
p.luid = hitobito_test.test_table.luid
test_table
pgid | luid | cv_date | orderid
4587 | U2300 | null | null
4444 | U7777 | null | null
conversion_log_
pgid | luid | cv_date | orderid |
3232 | U5454 | 2020-08-01 | xcdf23
9786 | U3745 | 2020-08-02 | fgtd43
4587 | U2300 | 2020-08-02 | aaav3 ⬅︎ I need to send this orderid to the first line in test_table
If I add the project name like below, I get this message:
"Syntax error: Expected end of input but got identifier "hitobito_test" at [6:33]"
insert into galvanic-ripsaw-281806.hitobito_test.test_table(orderid)
select orderid
from
`kuzen-198289.conversion_log.conversion_log_` as p
where
p.pgid = galvanic-ripsaw-281806.hitobito_test.test_table.pgid
AND
p.luid = hitobito_test.test_table.luid

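Please try this. The fully qualified table name has to be wrapped in backticks because the project ID contains hyphens, and test_table has to appear in a FROM clause (here inside EXISTS) before its columns can be referenced: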
INSERT INTO `galvanic-ripsaw-281806.hitobito_test.test_table`(orderid)
SELECT
orderid
FROM
`kuzen-198289.conversion_log.conversion_log_` AS p
WHERE
EXISTS (
SELECT 1
FROM
`galvanic-ripsaw-281806.hitobito_test.test_table` h
WHERE
p.pgid = h.pgid AND p.luid = h.luid)
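To see what the INSERT ... WHERE EXISTS pattern actually does, here is a minimal sketch that runs the same shape of statement against an in-memory SQLite database from Python. SQLite is only a stand-in for BigQuery (an assumption made for illustration, so the project-qualified names are dropped), and the rows are the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_table (pgid INTEGER, luid TEXT, cv_date TEXT, orderid TEXT);
CREATE TABLE conversion_log_ (pgid INTEGER, luid TEXT, cv_date TEXT, orderid TEXT);

INSERT INTO test_table VALUES
    (4587, 'U2300', NULL, NULL),
    (4444, 'U7777', NULL, NULL);
INSERT INTO conversion_log_ VALUES
    (3232, 'U5454', '2020-08-01', 'xcdf23'),
    (9786, 'U3745', '2020-08-02', 'fgtd43'),
    (4587, 'U2300', '2020-08-02', 'aaav3');
""")

# Same shape as the answer: copy orderid for every log row that has a
# matching (pgid, luid) pair in test_table.
conn.execute("""
INSERT INTO test_table (orderid)
SELECT p.orderid
FROM conversion_log_ AS p
WHERE EXISTS (
    SELECT 1
    FROM test_table AS h
    WHERE p.pgid = h.pgid AND p.luid = h.luid
)
""")

rows = conn.execute("SELECT pgid, luid, orderid FROM test_table").fetchall()
for row in rows:
    print(row)
```

Note that INSERT appends a new row that holds only the orderid; the existing (4587, U2300) row keeps its NULL orderid. Filling that row in place would need an UPDATE (or MERGE) instead.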


How to get a value inside of a JSON that is inside a column in a table in Oracle sql?

Suppose that I have a table named agents_timesheet that has a structure like this:
ID | name | health_check_record | date | clock_in | clock_out
---------------------------------------------------------------------------------------------------------
1 | AAA | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:25:07 |
| | "physical":{"other_symptoms":"headache", "flu":"no"}} | | |
---------------------------------------------------------------------------------------------------------
2 | BBB | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:26:12 |
| | "physical":{"other_symptoms":"no", "flu":"yes"}} | | |
---------------------------------------------------------------------------------------------------------
3 | CCC | {"mental":{"stress":"no", "depression":"severe"}, | 6-Dec-2021 | 08:27:12 |
| | "physical":{"other_symptoms":"cancer", "flu":"yes"}} | | |
Now I need to get all agents who have the flu on a given day. I can already get the flu value from a single JSON literal in Oracle SQL with this statement:
SELECT * FROM JSON_TABLE(
'{"mental":{"stress":"no", "depression":"no"}, "physical":{"fever":"no", "flu":"yes"}}', '$'
COLUMNS (flu VARCHAR2(3) PATH '$.physical.flu')
);
I can also read the health_check_record column itself with a plain SELECT statement.
But how do I get the values of flu from the JSON stored in the health_check_record column of that table?
Additional question
Based on the table, how can I retrieve the full list of other_symptoms, so that I get this kind of output:
ID | name | other_symptoms
-------------------------------
1 | AAA | headache
2 | BBB | no
3 | CCC | cancer
You can use the JSON_EXISTS() function with a filter expression in the path:
SELECT *
FROM agents_timesheet
WHERE JSON_EXISTS(health_check_record, '$?(@.physical.flu == "yes")');
There is also the "plain old way" without JSON parsing, simply treating the column like a standard VARCHAR one. This will not work in 100% of cases, but if your data always looks the way you described it, it might be sufficient.
SELECT *
FROM agents_timesheet
WHERE health_check_record LIKE '%"flu":"yes"%';
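A quick way to see why the LIKE shortcut is fragile: the pattern depends on the exact character layout of the stored text. The sketch below uses plain Python as a stand-in for the database, with made-up sample strings (both are assumptions for illustration), and compares a substring test with a real JSON lookup.

```python
import json

records = [
    '{"physical":{"other_symptoms":"no","flu":"yes"}}',    # compact layout
    '{"physical":{"other_symptoms":"no", "flu": "yes"}}',  # space after the colon
    '{"physical":{"other_symptoms":"no","flu":"no"}}',
]


def has_flu_like(text):
    """Mimics: WHERE health_check_record LIKE '%"flu":"yes"%'."""
    return '"flu":"yes"' in text


def has_flu_parsed(text):
    """Parses the JSON first, which is what a JSON-aware function does."""
    doc = json.loads(text)
    return doc.get("physical", {}).get("flu") == "yes"


like_hits = [has_flu_like(r) for r in records]
parsed_hits = [has_flu_parsed(r) for r in records]
print(like_hits)    # [True, False, False]
print(parsed_hits)  # [True, True, False]
```

The second record is missed by the substring test only because of one extra space, which is why the JSON-aware functions are the safer default.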
How to get the values of flu in the JSON in the health_check_record of that table?
From Oracle 12, to get the values you can use JSON_TABLE with a correlated CROSS JOIN to the table:
SELECT a.id,
a.name,
j.*,
a."DATE",
a.clock_in,
a.clock_out
FROM agents_timesheet a
CROSS JOIN JSON_TABLE(
a.health_check_record,
'$'
COLUMNS (
mental_stress VARCHAR2(3) PATH '$.mental.stress',
mental_depression VARCHAR2(3) PATH '$.mental.depression',
physical_fever VARCHAR2(3) PATH '$.physical.fever',
physical_flu VARCHAR2(3) PATH '$.physical.flu'
)
) j
WHERE physical_flu = 'yes';
You can use "dot notation" to access data from a JSON column. Like this:
select "DATE", id, name
from agents_timesheet t
where t.health_check_record.physical.flu = 'yes'
;
DATE ID NAME
----------- --- ----
06-DEC-2021 2 BBB
Note that this approach requires that you use an alias for the table name (so you can use it in accessing the JSON data).
For testing I used the data posted by MT0 on dbfiddle. I am not a big fan of double-quoted column names; use something else for "DATE", such as dt or date_.

SQLite - Filtering a query with a complex join table

Good afternoon everyone,
I work on a project which uses a SQLite3 database generated with Doctrine (a PHP ORM).
The underground_station table contains all stations in Paris:
CREATE TABLE underground_station (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
long_name VARCHAR(255) NOT NULL
);
The line table contains all lines in Paris:
CREATE TABLE line (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
commercial_name VARCHAR(255) NOT NULL );
The line_association table links each station to the metro lines that serve it:
CREATE TABLE line_association (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
underground_station_id INT NOT NULL,
line_id INT NOT NULL,
is_terminus BOOL NOT NULL,
CONSTRAINT fk_association_underground_station FOREIGN KEY (underground_station_id) REFERENCES underground_station(id),
CONSTRAINT fk_association_line FOREIGN KEY (line_id) REFERENCES line(id) );
I have a query that returns the underground station name, the lines served there and whether it is a terminus:
SELECT u.long_name, group_concat(l.commercial_name) as "lines", la.is_terminus
FROM underground_station u
JOIN line_association la on u.id = la.underground_station_id
JOIN line l on la.line_id = l.id
GROUP BY u.id;
Query result:
+-------------------------+------------------+-------------+
|long_name | lines | is_terminus |
+-------------------------+------------------+-------------+
|CHARLES DE GAULLE ETOILE | M6,M2,M1 | 0 |
+-------------------------+------------------+-------------+
|CHATEAU DE VINCENNES | M1 | 1 |
+-------------------------+------------------+-------------+
|CONCORDE | M12,M1,M8 | 0 |
+-------------------------+------------------+-------------+
|FRANKLIN-D.ROOSEVELT | M9,M | 0 |
+-------------------------+------------------+-------------+
|LA DEFENSE-GRANDE ARCHE | M1 | 1 |
+-------------------------+------------------+-------------+
|NATION | M2,M9,M6,M1 | 0 |
+-------------------------+------------------+-------------+
|CHATELET | M14,M1,M7,M11,M4 | 0 |
+-------------------------+------------------+-------------+
This query is working perfectly. My question is: how can I return the same data, but only for the underground stations (with all of their lines) that are served by a specific line such as 'M1'?
I have found this possibility, but the data is wrong because "connections" always returns "1" even though the underground stations have 2 or more connections:
SELECT underground_station.long_name,
(SELECT count(line_id)
FROM line_association
GROUP BY underground_station_id
HAVING count(line_id)) AS "connections",
is_terminus
FROM underground_station
JOIN line_association la on underground_station.id = la.underground_station_id
JOIN line l on la.line_id = l.id
WHERE l.commercial_name = 'M1';
Query result:
+-------------------------+-------------+-------------+
|long_name | connections | is_terminus |
+-------------------------+-------------+-------------+
|CHARLES DE GAULLE ETOILE | 1 | 0 |
+-------------------------+-------------+-------------+
|CHATEAU DE VINCENNES | 1 | 1 |
+-------------------------+-------------+-------------+
|CONCORDE | 1 | 0 |
+-------------------------+-------------+-------------+
|FRANKLIN-D.ROOSEVELT | 1 | 0 |
+-------------------------+-------------+-------------+
|LA DEFENSE-GRANDE ARCHE | 1 | 1 |
+-------------------------+-------------+-------------+
|NATION | 1 | 0 |
+-------------------------+-------------+-------------+
|CHATELET | 1 | 0 |
+-------------------------+-------------+-------------+
I have tried a "LIKE" condition, but the results then also contain the M14, M13, M12 and M11 lines when I only want the M1 underground stations and their connections.
I have also tried "instr(lines, 'M1')", but it only returns the data linked to the "M1" underground line.
Do you have any idea how to get the correct values when I filter by underground line?
I think you want a HAVING clause:
SELECT u.long_name, group_concat(l.commercial_name) as "lines", la.is_terminus
FROM underground_station u
JOIN line_association la on u.id = la.underground_station_id
JOIN line l on la.line_id = l.id
GROUP BY u.id
HAVING MAX(l.commercial_name = 'M1') = 1
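In SQLite a comparison evaluates to 0 or 1, so MAX(l.commercial_name = 'M1') = 1 keeps every group in which at least one joined row is on M1, while group_concat still sees all of the station's lines. Here is a minimal sketch of that, run from Python against an in-memory database. The rows are a small invented sample, and the MAX(CASE ...) used for is_terminus is my own adjustment (not part of the query above) so that the flag is taken from the M1 row rather than from an arbitrary row of the group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE underground_station (id INTEGER PRIMARY KEY, long_name TEXT NOT NULL);
CREATE TABLE line (id INTEGER PRIMARY KEY, commercial_name TEXT NOT NULL);
CREATE TABLE line_association (
    id INTEGER PRIMARY KEY,
    underground_station_id INT NOT NULL,
    line_id INT NOT NULL,
    is_terminus BOOL NOT NULL
);

INSERT INTO underground_station VALUES
    (1, 'CONCORDE'), (2, 'CHATEAU DE VINCENNES'), (3, 'BERCY');
INSERT INTO line VALUES (1, 'M1'), (2, 'M8'), (3, 'M12'), (4, 'M6'), (5, 'M14');
INSERT INTO line_association (underground_station_id, line_id, is_terminus) VALUES
    (1, 1, 0), (1, 2, 0), (1, 3, 0),   -- CONCORDE: M1, M8, M12
    (2, 1, 1),                         -- CHATEAU DE VINCENNES: M1 (terminus)
    (3, 4, 0), (3, 5, 0);              -- BERCY: M6, M14 (not on M1)
""")

rows = conn.execute("""
SELECT u.long_name,
       group_concat(l.commercial_name) AS "lines",
       MAX(CASE WHEN l.commercial_name = 'M1' THEN la.is_terminus END) AS is_terminus
FROM underground_station u
JOIN line_association la ON u.id = la.underground_station_id
JOIN line l ON la.line_id = l.id
GROUP BY u.id
HAVING MAX(l.commercial_name = 'M1') = 1
""").fetchall()

# group_concat does not guarantee an order, so compare the lines as a set.
result = {name: (set(lines.split(",")), terminus) for name, lines, terminus in rows}
print(result)
```

BERCY is left out even though 'M14' contains the text 'M1', which is exactly the false match that the LIKE attempt ran into.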

Validate if value exists before insert row

I have a query to which I pass a table type that has the columns EmpKey and TaskId, like:
@AssignNotificationTableType [dbo].[udf_TaskNotification] READONLY
INSERT INTO [TaskNotification] ([TaskId], [EmpKey])
SELECT
[ANT].[TaskId], [E].[EmpKey]
FROM
@AssignNotificationTableType AS [ANT]
INNER JOIN
[Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
So my table looks like this:
+--------------------------------------+--------------------------------------+--------+
| TaskNotificationId | TaskId | EmpKey |
+--------------------------------------+--------------------------------------+--------+
| EEE3D3F8-F190-E811-841F-C81F66DACA6A | D0440DEB-404C-4006-870F-E95BFFA840E0 | 44 |
| EFE3D3F8-F190-E811-841F-C81F66DACA6A | D0440DEB-404C-4006-870F-E95BFFA840E0 | 49 |
+--------------------------------------+--------------------------------------+--------+
As you can see, two rows have the same TaskId but a different EmpKey. So suppose I send the same TaskId D0440DEB-404C-4006-870F-E95BFFA840E0 again: I want to insert a row only if its EmpKey does not already exist for that TaskId.
So if I send something like:
+--+--------------------------------------+--------+
| | TaskId | EmpKey |
+--+--------------------------------------+--------+
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 44 |
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 49 |
| | D0440DEB-404C-4006-870F-E95BFFA840E0 | 54 |
+--+--------------------------------------+--------+
It should only insert the last row, because EmpKey 54 does not exist yet for that TaskId.
I tried to do it in the WHERE clause with NOT IN:
INSERT INTO [TaskNotification] ([TaskId], [EmpKey])
SELECT
[ANT].[TaskId], [E].[EmpKey]
FROM
@AssignNotificationTableType AS [ANT]
INNER JOIN
[Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
WHERE
[E].[EmpKey] NOT IN (SELECT EmpKey
FROM [TaskNotification]
WHERE TaskId = (SELECT TaskId
FROM @AssignNotificationTableType))
But when I run it, it doesn't insert anything. What am I doing wrong? Regards
Add the target table as a left join to the select statement:
INSERT INTO [TaskNotification]
(
[TaskId]
, [EmpKey]
)
SELECT
[ANT].[TaskId]
, [E].[EmpKey]
FROM @AssignNotificationTableType AS [ANT]
INNER JOIN [Employee] AS [E] ON [ANT].[EmpGuid] = [E].[EmpGuid]
LEFT JOIN [TaskNotification] AS [TN] ON [TN].[TaskId] = [ANT].[TaskId]
AND [TN].[EmpKey] = [E].[EmpKey]
WHERE [TN].[PK] IS NULL -- PK stands for the primary key column
-- (or the first column of a multi-column primary key)
Please note, however, that in a multithreaded environment such a query might fail. For more information, read this SO post and Dan Guzman's blog post it links to.
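The left-join-and-check-for-NULL pattern can be tried out with a minimal sketch in Python and an in-memory SQLite database. SQLite stands in for SQL Server here, the Incoming table is a hypothetical replacement for the table-valued parameter, and the ids are shortened made-up values; all of these are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpGuid TEXT PRIMARY KEY, EmpKey INTEGER NOT NULL);
CREATE TABLE TaskNotification (
    TaskNotificationId INTEGER PRIMARY KEY AUTOINCREMENT,
    TaskId TEXT NOT NULL,
    EmpKey INTEGER NOT NULL
);
-- Hypothetical stand-in for the table-valued parameter.
CREATE TABLE Incoming (TaskId TEXT NOT NULL, EmpGuid TEXT NOT NULL);

INSERT INTO Employee VALUES ('guid-44', 44), ('guid-49', 49), ('guid-54', 54);
INSERT INTO TaskNotification (TaskId, EmpKey) VALUES ('task-1', 44), ('task-1', 49);
INSERT INTO Incoming VALUES
    ('task-1', 'guid-44'), ('task-1', 'guid-49'), ('task-1', 'guid-54');
""")


def count_notifications():
    return conn.execute("SELECT COUNT(*) FROM TaskNotification").fetchone()[0]


before = count_notifications()

# Rows that already exist find a match in tn; only unmatched rows
# (tn primary key IS NULL) survive the WHERE clause and get inserted.
conn.execute("""
INSERT INTO TaskNotification (TaskId, EmpKey)
SELECT ant.TaskId, e.EmpKey
FROM Incoming AS ant
INNER JOIN Employee AS e ON ant.EmpGuid = e.EmpGuid
LEFT JOIN TaskNotification AS tn
       ON tn.TaskId = ant.TaskId AND tn.EmpKey = e.EmpKey
WHERE tn.TaskNotificationId IS NULL
""")

inserted = count_notifications() - before
pairs = conn.execute(
    "SELECT TaskId, EmpKey FROM TaskNotification ORDER BY EmpKey"
).fetchall()
print(inserted, pairs)
```

Running the same INSERT a second time adds nothing, because all three pairs now find a match in the left join.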

Select rows with same value in one column but different value in another column

I've been trying to build this query but am new to SQL so I'd really appreciate some help.
In the table example below, I have a Customer Code, a linked Customer Code (which is used to link a child customer to a parent customer), a salesperson, and other irrelevant columns. The goal is to have one Salesperson for each parent customer and its children. So in the example, CustCode #100 is the parent of itself, #200, #500, and #800. All of these accounts have the same Salesperson (JASON), which is perfect. But CustCode #300 is the parent of itself, #400, and #600, and there isn't a single salesperson assigned: it's both JIM and SUZY. I want to build a query that shows all accounts like the ones in this second group. Basically, accounts where the Salesperson field isn't the same value for a parent and all of its child customers.
I tried a WHERE clause with Salesperson <> Salesperson, but it's not giving the right result.
+-----------+-----------------+------------+----------------------+
| CustCode | Linked CustCode | Salesperson| additional columns...|
+-----------+-----------------+------------+----------------------+
| 100 | 100 | JASON | ... |
| 200 | 100 | JASON | ... |
| 300 | 300 | JIM | ... |
| 400 | 300 | JIM | ... |
| 500 | 100 | JASON | ... |
| 600 | 300 | SUZY | ... |
| 700 | NULL | JIM | ... |
| 800 | 100 | JASON | ... |
+-----------+-----------------+------------+----------------------+
Thanks so much for your help!
You can do self join on the table.
select distinct r2.* from
table r1
join table r2
on
r1.linkedcustcode = r2.linkedcustcode and r1.salesperson <> r2.salesperson
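Here is a minimal sketch of that self join, run from Python against an in-memory SQLite database with the sample rows from the question. The table name customers is a hypothetical stand-in for the placeholder name used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (custcode INTEGER, linkedcustcode INTEGER, salesperson TEXT);
INSERT INTO customers VALUES
    (100, 100, 'JASON'), (200, 100, 'JASON'), (300, 300, 'JIM'), (400, 300, 'JIM'),
    (500, 100, 'JASON'), (600, 300, 'SUZY'), (700, NULL, 'JIM'), (800, 100, 'JASON');
""")

# A row is returned when some other row in the same linked group
# has a different salesperson.
mismatched = conn.execute("""
SELECT DISTINCT r2.custcode, r2.linkedcustcode, r2.salesperson
FROM customers AS r1
JOIN customers AS r2
  ON r1.linkedcustcode = r2.linkedcustcode
 AND r1.salesperson <> r2.salesperson
ORDER BY r2.custcode
""").fetchall()
print(mismatched)
```

Only the three accounts linked to #300 come back. Account #700 never matches because its linked code is NULL, and NULL = NULL is not true in a join condition.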
This solution uses a recursive CTE first to build the hierarchy and find the leading code for each row, even if a linked code points to a row which is pointing to an upper row itself.
The final query shows the count of different Salespersons:
DECLARE @tbl TABLE(CustCode INT,[Linked CustCode] INT,Salesperson VARCHAR(100));
INSERT INTO @tbl VALUES
(100,100,'JASON')
,(200,100,'JASON')
,(300,300,'JIM')
,(400,300,'JIM')
,(500,100,'JASON')
,(600,300,'SUZY')
,(700,NULL,'JIM')
,(800,100,'JASON');
--The query
WITH CleanUp AS
(
SELECT CustCode
,CASE WHEN [Linked CustCode]=CustCode THEN NULL ELSE [Linked CustCode] END AS [Linked CustCode]
,Salesperson
FROM @tbl
)
,recCTE AS
(
SELECT CustCode AS LeadingCode,CustCode,[Linked CustCode],Salesperson
FROM CleanUp
WHERE [Linked CustCode] IS NULL
UNION ALL
SELECT recCTE.LeadingCode,t.CustCode,t.[Linked CustCode],t.Salesperson
FROM recCTE
INNER JOIN CleanUp AS t ON t.[Linked CustCode]=recCTE.CustCode
)
SELECT LeadingCode,COUNT(DISTINCT Salesperson) AS CountSalesperson
FROM recCTE
GROUP BY LeadingCode
The result
LeadingCode CountSalesperson
100 1
300 2
700 1
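The same recursive CTE can be tried outside SQL Server. Below is a minimal sketch in Python against an in-memory SQLite database; SQLite needs the RECURSIVE keyword, and the column is renamed to LinkedCustCode to avoid the space (both are adaptations I made for this sketch, not part of the query above). The last step, which keeps only the groups with more than one salesperson, is also my addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (CustCode INTEGER, LinkedCustCode INTEGER, Salesperson TEXT);
INSERT INTO tbl VALUES
    (100, 100, 'JASON'), (200, 100, 'JASON'), (300, 300, 'JIM'), (400, 300, 'JIM'),
    (500, 100, 'JASON'), (600, 300, 'SUZY'), (700, NULL, 'JIM'), (800, 100, 'JASON');
""")

counts = conn.execute("""
WITH RECURSIVE CleanUp AS (
    -- A row that links to itself is treated as a top-level row.
    SELECT CustCode,
           CASE WHEN LinkedCustCode = CustCode THEN NULL ELSE LinkedCustCode END
               AS LinkedCustCode,
           Salesperson
    FROM tbl
),
recCTE AS (
    -- Anchor: top-level rows lead their own group.
    SELECT CustCode AS LeadingCode, CustCode, LinkedCustCode, Salesperson
    FROM CleanUp
    WHERE LinkedCustCode IS NULL
    UNION ALL
    -- Recursion: children inherit the leading code of their parent.
    SELECT recCTE.LeadingCode, t.CustCode, t.LinkedCustCode, t.Salesperson
    FROM recCTE
    INNER JOIN CleanUp AS t ON t.LinkedCustCode = recCTE.CustCode
)
SELECT LeadingCode, COUNT(DISTINCT Salesperson) AS CountSalesperson
FROM recCTE
GROUP BY LeadingCode
ORDER BY LeadingCode
""").fetchall()

# Groups whose members do not all share one salesperson.
inconsistent = [code for code, n in counts if n > 1]
print(counts)        # [(100, 1), (300, 2), (700, 1)]
print(inconsistent)  # [300]
```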

Assistance with SQL multi-table query - returning duplicate results

We use an online project management system, and I'm trying to extend it somewhat.
It has the following tables of interest:
todo_itemStatus:
+--------------+-----------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------------------+----------------+
| itemStatusId | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| itemId | int(10) unsigned | NO | MUL | 0 | |
| statusDate | datetime | NO | | 0000-00-00 00:00:00 | |
| statusKey | tinyint(3) unsigned | NO | | 0 | |
| memberId | mediumint(8) unsigned | NO | | 0 | |
+--------------+-----------------------+------+-----+---------------------+----------------+
This table keeps track of when a task is complete, and also keeps the status of all task changes.
There's then a project table, and an 'item' (or task) table.
I basically want to be able to extract a list of projects, with details on the percentage of tasks complete. However, for now I'd be happy if I could just list each task in a project with details on whether they're complete.
As far as I'm aware, the best way to get the most recent status of a task is to choose the todo_itemStatus row where the statusDate is the newest, or the itemStatusId is the largest, for the itemId of the task I'm interested in.
I tried a query like this:
select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate
from todo_item, todo_project, todo_itemStatus
where todo_item.projectId = todo_project.projectId
and todo_project.projectId = 13
and todo_itemStatus.itemId = todo_item.itemId
and todo_itemStatus.statusDate = (
select MAX(todo_itemStatus.statusDate)
from todo_itemStatus key1 where todo_itemStatus.itemId = key1.itemId);
However, this yields all status updates with output like this:
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
| itemId | title | statusKey | statusDate |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
| 579 | test complete item - delete me | 1 | 2009-07-28 13:04:38 |
| 579 | test complete item - delete me | 0 | 2009-07-28 14:12:12 |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
Which isn't what I want, as I only want one entry returned per task, with the statusKey / statusDate from the most recent entry in the todo_itemStatus table.
I know I've been a bit vague in my description, but I didn't want to write a massively long message. I can provide much more detail if necessary.
Please can someone suggest what I'm doing wrong? It's been a long time since I've done any real database stuff, so I'm a bit unsure what I'm doing wrong here...
Many thanks!
Dave
You should look into using the DISTINCT keyword (Microsoft SQL Server)
EDIT: I've just re-read your question and I think that the GROUP BY clause is more suited in this situation. You should read http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/, however essentially what you need to do is first select the columns that you are interested in using a GROUP BY clause:
SELECT todo_itemStatus.itemId, MAX(todo_itemStatus.statusDate) AS maxStatusDate
FROM todo_item, todo_project, todo_itemStatus
WHERE todo_item.projectId = todo_project.projectId
AND todo_itemStatus.itemId = todo_item.itemId
AND todo_project.projectId = 13
GROUP BY todo_itemStatus.itemId
We then join back to this set of item ids and dates to get the rest of the columns we are interested in:
SELECT
todo_item.itemId,
todo_item.title,
todo_itemStatus.statusKey,
todo_itemStatus.statusDate
FROM todo_item
JOIN todo_itemStatus
ON todo_itemStatus.itemId = todo_item.itemId
JOIN
(SELECT todo_itemStatus.itemId, MAX(todo_itemStatus.statusDate) AS maxStatusDate
FROM todo_item, todo_project, todo_itemStatus
WHERE todo_item.projectId = todo_project.projectId
AND todo_itemStatus.itemId = todo_item.itemId
AND todo_project.projectId = 13
GROUP BY todo_itemStatus.itemId) AS x
ON todo_itemStatus.itemId = x.itemId
AND todo_itemStatus.statusDate = x.maxStatusDate
I've experimented some more and the following query does what I want:
select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate
from todo_itemStatus, todo_item
where todo_item.itemId = todo_itemStatus.itemId
and todo_item.projectId = 13
and todo_itemStatus.statusDate = (
select MAX(status.statusDate)
from todo_itemStatus as status
where status.itemId = todo_item.itemId);
So I'm now happy. Thanks for all the help and the suggestions.
Dave.
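For anyone who wants to try the "latest status per item" pattern without the original system, here is a minimal sketch in Python against an in-memory SQLite database. SQLite stands in for MySQL, the two tables are cut down to the columns used in the thread, and dates are stored as ISO text; all of these are assumptions for illustration. The two status rows are the ones shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE todo_item (itemId INTEGER PRIMARY KEY, projectId INTEGER, title TEXT);
CREATE TABLE todo_itemStatus (
    itemStatusId INTEGER PRIMARY KEY AUTOINCREMENT,
    itemId INTEGER NOT NULL,
    statusDate TEXT NOT NULL,
    statusKey INTEGER NOT NULL
);
INSERT INTO todo_item VALUES (579, 13, 'test complete item - delete me');
INSERT INTO todo_itemStatus (itemId, statusDate, statusKey) VALUES
    (579, '2009-07-28 13:04:38', 1),
    (579, '2009-07-28 14:12:12', 0);
""")

# The inner MAX() is computed over the aliased inner table (s2) and is
# correlated with the outer item, so each item keeps only its newest status.
latest = conn.execute("""
SELECT i.itemId, i.title, s.statusKey, s.statusDate
FROM todo_item AS i
JOIN todo_itemStatus AS s ON s.itemId = i.itemId
WHERE i.projectId = 13
  AND s.statusDate = (
      SELECT MAX(s2.statusDate)
      FROM todo_itemStatus AS s2
      WHERE s2.itemId = i.itemId
  )
""").fetchall()
print(latest)
```

If two status rows of one item ever share the same statusDate, both come back; comparing on the largest itemStatusId instead would avoid that.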