SQL group by issue when i try to get some info - sql

I can not figure it out:
I have a table called ImportantaRecords with fields like market, zip5, MHI, MHV, TheTable and I want to group all the records by zip5 that have TheTable = 'mg'… I tried this :
select a.Market,a.zip5,count(a.zip5),a.MHI,a.MHV,a.TheTable from
(select * from ImportantaRecords where TheTable = 'mg') a
group by a.Zip5
but it gives me the classic error with not an aggrefate function
and then I tried this:
select Market,zip5,count(zip5),MHI,MHV,TheTable from ImportantaRecords where TheTable = 'mg'
group by Zip5
and the same thing…
any help ?

You did not state what database you are using but if you are getting an error about columns not being in an aggregate function, then you might need to add the columns not in an aggregate function to the GROUP BY:
select Market,
zip5,
count(zip5),
MHI,
MHV,
TheTable
from ImportantaRecords
where TheTable = 'mg'
group by Market, Zip5, MHI, MHV, TheTable;
If grouping by the additional columns alters the result that you are expecting, then you could use a subquery to get the result:
select i1.Market,
i1.zip5,
i2.Total,
i1.MHI,
i1.MHV,
i1.TheTable
from ImportantaRecords i1
inner join
(
select zip5, count(*) Total
from ImportantaRecords
where TheTable = 'mg'
group by zip5
) i2
on i1.zip5 = i2.zip5
where i1.TheTable = 'mg'

Related

Query error: Column name ICUSTAY_ID is ambiguous. Using multiple subqueries in BigQuery

Hi, I receive the following query error "Query error: Column name ICUSTAY_ID is ambiguous" referred to the third last line of code (see the following code). Please can you help me? Thank you so much!
I am an SQL beginner..
WITH t AS
(
SELECT
*
FROM
(
SELECT *,
DATETIME_DIFF(CHARTTIME, INTIME, MINUTE) AS pi_recorded
FROM
(
SELECT
*
FROM
(
SELECT * FROM
(SELECT i.SUBJECT_ID, p.dob, i.hadm_id, p.GENDER, a.ETHNICITY, a.ADMITTIME, a.INSURANCE, i.ICUSTAY_ID,
i.DBSOURCE, i.INTIME, DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) AS age,
CASE
WHEN DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) <= 32485
THEN 'adult'
WHEN DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) > 32485
then '>89'
END AS age_group
FROM `project.mimic3.ICUSTAYS` AS i
INNER JOIN `project.mimic3.PATIENTS` AS p ON i.SUBJECT_ID = p.SUBJECT_ID
INNER JOIN `project.mimic3.ADMISSIONS` AS a ON i.HADM_ID = a.HADM_ID)
WHERE age >= 6570
) AS t1
LEFT JOIN
(
SELECT ITEMID, ICUSTAY_ID, CHARTTIME, VALUE, FROM `project.mimic3.CHARTEVENTS`
WHERE ITEMID = 551 OR ITEMID = 552 OR ITEMID = 553 OR ITEMID = 224631
OR ITEMID = 224965 OR ITEMID = 224966
) AS t2
ON t1.ICUSTAY_ID = t2.ICUSTAY_ID
)
)
WHERE ITEMID IN (552, 553, 224965, 224966) AND pi_recorded <= 1440
)
SELECT ICUSTAY_ID #### Query error: Column name ICUSTAY_ID is ambiguous
FROM t
GROUP BY ICUSTAY_ID;
Both t1 and t2 have a column called ICUSTAY_ID. When you join them together into a single dataset you end up with 2 columns with the same name - which obviously can't work as there would be no way of uniquely identify each column.
You need to alias these columns in you code or not include one or the other if you don't need both

NULL id in SQL statement

Select *
from tbl
where id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
My id column below comes back with a NULL for one of my events. They are related using the matching_id key. Is there a way I can write a case statement to populate that id field so its not NULL?
You can use coalesce() and in case of null id fetch a value (any value?) for the id matching the matching_id like this:
Select
coalesce(
id,
(select max(id) from tbl where matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16')
) as id,
matching_id,
event_name
from tbl
where
id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or
matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
Edit.
Try this in case of unknown values:
Select
coalesce(
t.id,
(select max(id) from tbl where matching_id = t.matching_id)
) as id,
coalesce(
t.matching_id,
(select max(matching_id) from tbl where id = t.id)
) as matching_id,
t.event_name
from tbl t
Ended up writing something like this..
[snapshot 1
with find_event AS (
Select tbl1. id, tbl1.matching_id
from tbl1
where tbl1.id IS NOT NULL
)
Select
CASE WHEN find_event. id IS NOT NULL
THEN find_event. id
WHEN find_event.id IS NULL
THEN tbl2. id
ELSE tbl2. id
END as final_id,
tbl2.matching_id,
tbl2. id,
tbl2.event_name,
find _event. id as find_canonical_id
from tbl2
left JOIN find_event on find_event.matching_id = tbl2.matching_id
where
tbl2.transaction_canonical_id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or
tbl2.matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
If you want to replace the column values with the values from your parameters, you could use something like this:
with ids (id, mid) as (
values
('1fa3bcdc9a1cf60f02a2ae774e2cf166', 'ea74c270-65fd-46d0-898b-faf1a7bf7e16')
)
Select coalesce(tbl.id, ids.id),
tbl.matching_id,
tbl.event_name
from tbl
join ids on ids.id = tbl.id or ids.mid = tbl.matching_id;
You may be able to utilize the ISNULL() function for this. Maybe something like:
Select
ISNULL(id, matching_id) AS [myid],
matching_id,
event_name
from tbl
where id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'

How to pivot two rows into two columns

I have the following SQL Query:
select
distinct
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
from
Equipment,
Studies,
Equipment_Reserved
where
Studies.Study = 'MAINT19-01'
and
Equipment.idEquipment = Equipment_Reserved.Equipment_idEquipment
and
Studies.idStudies = Equipment_Reserved.Studies_idStudies
and
Equipment.Type = 'Probe'
This query produces the following results:
Equipment_Attached_To Name
2297 R1-P1
2297 R1-P2
2299 R1-P3
I would like to change it to the following:
Equipment_Attached_To Name1 Name2
2297 R1-P1 R1-P2
2299 R1-P3 NULL
Thanks for your help!
I'd first change your query from the old, legacy JOIN syntax to an explicit join as it makes the query easier to understand:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
I don't think you actually need a PIVOT - I think you can do this with a nested query with the ROW_NUMBER function. I've seen that PIVOT queries often have worse query execution plans than nested-queries.
Let's add ROW_NUMBER (which require an ORDER BY as it's a windowing-function) and a matching ORDER BY in the whole query to make it consistent). Let's also use PARTITION BY so it resets the row-number for each Equipment_Attached_To value:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
ORDER BY
Equipment_Attached_To,
[Name]
This will give output like this:
Equipment_Attached_To Name RowNumber
2297 R1-P1 1
2297 R1-P2 2
2299 R1-P3 1
This can then be split out into explicit columns like so below. The use of MAX() is arbitrary (we could use MIN() instead) and only because we're dealing with a GROUP BY and because the CASE WHEN... restricts the input set to just 1 row anyway.
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
-- the query from above
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
So the final query is:
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
Let's start with some basics.
To facilitate reading the code, I added alias to the tables using their initials.
Then, I converted the old join syntax which is partly deprecated to use the standard syntax since 1992 (27 years and people still use the old syntax).
Finally, since there are only 2 possible values, we can use MIN and MAX to separate them in 2 columns.
And because we're using aggregate functions, we remove the DISTINCT and use GROUP BY
The code now looks like this:
SELECT er.Equipment_Attached_To,
--Gets the first row for the id
MIN( e.Name) AS Name1,
--If the MAX is equal to the MIN, returns a NULL. If not, it returns the second value.
NULLIF( MAX(e.Name), MIN( e.Name)) AS Name2
FROM Equipment e
JOIN Studies s ON s.idStudies = er.Studies_idStudies
JOIN Equipment_Reserved er ON e.idEquipment = er.Equipment_idEquipment
WHERE s.Study = 'MAINT19-01'
AND e.Type = 'Probe'
GROUP BY er.Equipment_Attached_To;

Syntax error in FROM clause - MS ACCESS

I am working with a tool that would extract some data from an Access Database. So basically, i am working on a query to get this data.
Below is the code i am currently working on.
I am getting an error: Syntax error in FROM clause
I can't seem to find where the query is going wrong. I would appreciate any help! Thank youu.
EDIT: putting my actual query
SELECT table_freq.*, IIF(table_freq.txn_ctr > (table_ave_freq.ave_freq * 3), "T", "F") as suspicious_flag
FROM
(
SELECT tbl_TransactionHistory.client_num, tbl_TransactionHistory.client_name,
tbl_TransactionHistory.transaction_date, Count(tbl_TransactionHistory.client_num) AS txn_ctr
FROM tbl_TransactionHistory
GROUP BY tbl_TransactionHistory.client_num, tbl_TransactionHistory.client_name,
tbl_TransactionHistory.transaction_date
) AS table_freq
INNER JOIN
(
SELECT table_total_freq.client_num, total_txn_ctr as TotalTransactionFrequency, total_no_days as TotalTransactionDays,
(table_total_freq.total_txn_ctr)/(table_no_of_days.total_no_days) AS ave_freq
FROM
(
(
SELECT client_num, SUM(txn_ctr) AS total_txn_ctr
FROM
(
SELECT client_num, client_name, transaction_date, COUNT(client_num) AS txn_ctr
FROM tbl_TransactionHistory
GROUP BY client_num, client_name, transaction_date
) AS tabFreq
GROUP BY client_num
) AS table_total_freq
INNER JOIN
(
SELECT client_num, COUNT(txn_date) as total_no_days
FROM
(
SELECT DISTINCT(transaction_date) as txn_date, client_num
FROM tbl_TransactionHistory
ORDER BY client_num
) AS table1
GROUP BY client_num
) AS table_no_of_days
ON table_total_freq.client_num = table_no_of_days.client_num
)
) AS table_ave_freq
ON table_freq.client_num = table_ave_freq.client_num

Looking for duplicates based on a few other columns

I am trying to find the rows where PilotID has used the shimpmentNumber more than once.
I have this so far.
select f_Shipment_ID
,f_date
,f_Pilot_ID
,f_Shipname
,f_SailedFrom
,f_SailedTo
,f_d_m
,f_Shipmentnumber
,f_NumberOfPilots
from t_shipment
where f_Pilot_ID < 10000
and f_NumberOfPilots=1
and f_Shipmentnumber in(select f_Shipmentnumber
from t_shipment
group by f_Shipmentnumber
Having count(*) >1)
Try something like this:
-- The CTE determines the f_Pilot_ID/f_Shipmentnumber combinations that appear more than once.
with DuplicateShipmentNumberCTE as
(
select
f_Pilot_ID,
f_Shipmentnumber
from
t_shipment
where
f_Pilot_ID < 10000 and
f_NumberOfPilots = 1
group by
f_Pilot_ID,
f_Shipmentnumber
having
count(1) > 1
)
select
Shipment.f_Shipment_ID,
Shipment.f_date,
Shipment.f_Pilot_ID,
Shipment.f_Shipname,
Shipment.f_SailedFrom,
Shipment.f_SailedTo,
Shipment.f_d_m,
Shipment.f_Shipmentnumber,
Shipment.f_NumberOfPilots
from
-- The join is used to restrict the result set to the shipments identified by the CTE.
t_shipment Shipment
inner join DuplicateShipmentNumberCTE CTE on
Shipment.f_Pilot_ID = CTE.f_Pilot_ID and
Shipment.f_Shipmentnumber = CTE.f_Shipmentnumber
where
f_NumberOfPilots = 1;
You can also do this with a subquery if you want to—or if you're using an old version of SQL Server that doesn't support CTEs—but I find the CTE syntax to be more natural, if only because it enables you to read and understand the query from the top down, rather than from the inside out.
In your sub select use:
select f_Shipmentnumber
from t_shipment
group by f_pilot_id, f_Shipmentnumber
Having count(*) >1
How about this
select f_Shipment_ID
,f_date
,f_Pilot_ID
,f_Shipname
,f_SailedFrom
,f_SailedTo
,f_d_m
,f_Shipmentnumber
,f_NumberOfPilots
from t_shipment
where f_Pilot_ID < 10000
and f_NumberOfPilots=1
and f_Pilot_ID IN (select f_Pilot_ID
from t_shipment
group by f_Pilot_ID, f_Shipmentnumber
Having count(*) >1)