Condensing Data from 4 Columns to 2 - sql

I'm trying to combine two sets of columns down to one set in SQL, where all sets have a common JobID and Date.
I want to take columns FrOpr and BkOpr and condense them down to one Opr field, while also taking their corresponding FrExtract and BkExtract fields down to one corresponding Extract field.
Any thoughts on how to do this?
All the responses are much appreciated. I adapted one of the queries below and used it to create a column of data that I wanted to reference and extract from in a larger query.
The output gives me two columns, an Opr and an Extract column. In the larger query, I'm looking to select just the values from the new Extract column and then sum them up as a "Completed" output. My problem is knowing where and how to splice or nest this into the existing query. Any thoughts on how to do this without creating a temp table? Here is the larger query I want to add this to:
SELECT CONCAT(Operators.OprExtID,'_CIREG') AS Processor,
Convert(VARCHAR(8), Data.StartDateTime, 112) AS [Processed Date],
CONCAT('DEPTRI_',Machines.EquipmentType,'_',JobTypes.JobTypeDesc,'_',Jobs.JobName) AS [Activity Type],
SUM(Data.Handled) AS Completed
FROM dbo.Operators, dbo.Data DataInput, dbo.jobs jobs, dbo.Machines, dbo.JobTypes WITH (nolock)
WHERE (Jobs.ID = Data.JobID
AND Data.FrOpr = Operators.Operator
AND Data.MachNo = Machines.MachNo
AND Data.JobTypeID = JobTypes.JobTypeID)
Processor | Processed Date | Activity Type | Completed
0023390_CIREG | 20190116 | DEPTRI_LWACS_EXTRACTION_UTGENERAL | 43.61
0023390_CIREG | 20190116 | DEPTRI_MWACS_DOC PREP_AGGEN | 7.76
0023390_CIREG | 20190116 | DEPTRI_SWACS_OPENING_UTGENERAL | 808

Use UNION ALL:
SELECT JobId , Date , FrOpr AS Opr , FrExtract AS Extract
FROM <TableName>
WHERE FrOpr IS NOT NULL
UNION ALL
SELECT JobId , Date , BkOpr AS Opr , BkExtract AS Extract
FROM <TableName>
WHERE BkOpr IS NOT NULL

One option is a CROSS APPLY
Example
Select A.JobID
,A.Date
,B.*
From YourTable A
Cross Apply ( values (FrOpr,FrExtract)
,(BkOpr,BkExtract)
) B(Opr,Extract)

Welcome to Stack Overflow! In the future, please provide sample data and desired results in text form.
This is a pretty simple un-pivot, which I'd do with a Union:
Select
JobId
, Date
, FrOpr as Opr
, FrExtract as Extract
, 'Fr' as Source_Column_Set
From <table_name>
Where <whatever conditions your application requires>
Union
Select
JobId
, Date
, BkOpr as Opr
, BkExtract as Extract
, 'Bk' as Source_Column_Set
From <table_name>
Where <whatever conditions your application requires>
You can make that a CTE and sort the results any way you like.
p.s. I included Source_Column_Set to avoid data loss.
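To answer the follow-up about summing the new Extract column without a temp table: one option is to keep the un-pivot as a CTE (or derived table) and aggregate over it in the outer query. Below is a minimal sketch of that idea, run against SQLite from Python so it can be tried anywhere. The table name job_data and the sample rows are made up for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- job_data is a hypothetical table name; the rows are invented sample data
    CREATE TABLE job_data (
        JobID INTEGER, Date TEXT,
        FrOpr TEXT, FrExtract REAL,
        BkOpr TEXT, BkExtract REAL
    );
    INSERT INTO job_data VALUES
        (1, '2019-01-16', 'A', 10, 'B', 5),
        (2, '2019-01-16', 'A', 7, NULL, NULL);
""")

# The CTE un-pivots the Fr/Bk column pairs into (Opr, Extract) rows;
# the outer query sums Extract per operator, so no temp table is needed.
UNPIVOT_AND_SUM = """
WITH unpivoted AS (
    SELECT JobID, Date, FrOpr AS Opr, FrExtract AS Extract
    FROM job_data
    WHERE FrOpr IS NOT NULL
    UNION ALL
    SELECT JobID, Date, BkOpr AS Opr, BkExtract AS Extract
    FROM job_data
    WHERE BkOpr IS NOT NULL
)
SELECT Opr, SUM(Extract) AS Completed
FROM unpivoted
GROUP BY Opr
ORDER BY Opr
"""

completed = conn.execute(UNPIVOT_AND_SUM).fetchall()
print(completed)
```

In the larger query, the same CTE could probably take the place of the Data table, joining Operators on Opr instead of FrOpr, but that depends on details of the schema that are not shown in the question.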

Related

How to join 2 tables that have the values represented differently in each table?

I currently have 2 tables estimate_details and delivery_service.
estimate_details has a column called event that has events such as: checkout, buildOrder
delivery_service has a column called source that has events such as: makeBasket, buildPurchase
checkout in estimate_details is equivalent to makeBasket in delivery_service, and buildOrder is equivalent to buildPurchase.
estimate_details
id | event | ...
1 | checkout | ...
2 | buildOrder | ...
delivery_service
id | source | date | ...
1 | makeBasket | '2022-10-01' | ...
2 | buildPurchase | '2022-10-02' | ...
1 | makeBasket | '2022-10-20' | ...
I would like to be able to join the tables on the event and source columns where checkout = makeBasket and buildOrder = buildPurchase.
Also if there are multiple records for the specific ID and source in delivery_service , choose the latest one.
How would I be able to do this? I cannot UPDATE either table to have the same values as the other table.
I still want all the data from estimate_details, but would like the latest records from the delivery_service.
The Expected output in this situation would be:
id | event | Date | ...
1 | checkout | '2022-10-20' | ...
2 | buildOrder | '2022-10-02' | ...
The best approach here is to use a CTE, which is like a subquery but more readable.
So first, in the CTE, use the delivery_service table to get the max date for each id and source. Then map the source text with a CASE expression so that it matches the event values in estimate_details:
WITH delivery_service_cte AS (
SELECT
id
, CASE
WHEN source = 'makeBasket' THEN 'checkout'
WHEN source = 'buildPurchase' THEN 'buildOrder'
END AS source
, MAX(date) AS date
FROM
delivery_service
GROUP BY
1, 2
)
SELECT
ed.* -- select whichever columns you want from here
, ds.id
, ds.source
, ds.date
FROM
estimate_details ed
LEFT JOIN
-- or JOIN (you didn't give enough info on what you are trying to achieve in
-- the output)
delivery_service_cte ds
ON ds.id = ed.id
AND ds.source = ed.event
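For anyone who wants to try this against the sample data from the question, here is a small sketch using SQLite from Python. It follows the same CTE idea; matching on id as well as on the mapped event is my assumption, based on the requirement to pick the latest record "for the specific ID and source". Dates are stored as ISO text, so MAX() picks the latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE estimate_details (id INTEGER, event TEXT);
    CREATE TABLE delivery_service (id INTEGER, source TEXT, date TEXT);
    INSERT INTO estimate_details VALUES (1, 'checkout'), (2, 'buildOrder');
    INSERT INTO delivery_service VALUES
        (1, 'makeBasket',    '2022-10-01'),
        (2, 'buildPurchase', '2022-10-02'),
        (1, 'makeBasket',    '2022-10-20');
""")

# The CTE keeps one row per (id, source) with the latest date and
# translates source values into the vocabulary used by estimate_details.
QUERY = """
WITH latest AS (
    SELECT id,
           CASE source
               WHEN 'makeBasket'    THEN 'checkout'
               WHEN 'buildPurchase' THEN 'buildOrder'
           END AS event,
           MAX(date) AS date
    FROM delivery_service
    GROUP BY id, source
)
SELECT ed.id, ed.event, latest.date
FROM estimate_details ed
LEFT JOIN latest
       ON latest.id = ed.id          -- assumption: ids must match too
      AND latest.event = ed.event
ORDER BY ed.id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

The LEFT JOIN keeps every estimate_details row, even when delivery_service has no matching record; in that case date comes back as NULL.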

Using UNION ALL to combine two queries into one table

Trying to combine two queries that find the average value of column 'duration_minutes', broken down by two criteria (column 'member_casual', for which there are only 2 options: 'member' or 'casual'). I have been trying the following syntax, which does display the data that I want, but in two rows rather than two columns:
SELECT * FROM(
SELECT AVG(duration_minutes) as cas_avg
FROM `case-study-319921.2020_2021_Trip_Data.2020_2021_Rides_Merged`
WHERE member_casual = 'casual'
UNION ALL
SELECT AVG(duration_minutes) as mem_avg
FROM `case-study-319921.2020_2021_Trip_Data.2020_2021_Rides_Merged`
WHERE member_casual = 'member');
Resulting table:
Row | cas_avg
1 | 40.81073227046788
2 | 11.345919528176575
How would I combine those two queries so that the result from row 2 would instead display as a column with the header "mem_avg" (the alias that was given in the query)?
try below
SELECT
AVG(IF(member_casual = 'casual', duration_minutes, null) ) as cas_avg,
AVG(IF(member_casual = 'member', duration_minutes, null) ) as mem_avg,
FROM `case-study-319921.2020_2021_Trip_Data.2020_2021_Rides_Merged`
You would use group by:
SELECT member_casual, AVG(duration_minutes) as cas_avg
FROM `case-study-319921.2020_2021_Trip_Data.2020_2021_Rides_Merged`
GROUP BY member_casual;
If there are more than two types, you may need to add a filter:
WHERE member_casual IN ('casual', 'member')
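The difference between the two answers is easy to see on a tiny data set. The sketch below uses SQLite from Python, so the BigQuery IF(...) is written as a CASE expression, which behaves the same way here. The rides table and its four rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- rides is a hypothetical stand-in for the merged trip table
    CREATE TABLE rides (member_casual TEXT, duration_minutes REAL);
    INSERT INTO rides VALUES
        ('casual', 40), ('casual', 20),
        ('member', 10), ('member', 12);
""")

# Conditional aggregation: one row, one column per rider type.
# Rows that do not match the CASE yield NULL, which AVG ignores.
one_row = conn.execute("""
    SELECT AVG(CASE WHEN member_casual = 'casual' THEN duration_minutes END) AS cas_avg,
           AVG(CASE WHEN member_casual = 'member' THEN duration_minutes END) AS mem_avg
    FROM rides
""").fetchone()

# GROUP BY: one row per rider type, with the type as a column value.
per_type = conn.execute("""
    SELECT member_casual, AVG(duration_minutes) AS avg_duration
    FROM rides
    GROUP BY member_casual
    ORDER BY member_casual
""").fetchall()

print(one_row)
print(per_type)
```

So the first form gives the two-column layout asked for in the question, while the GROUP BY form scales better if more rider types are added later.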

Postgresql query for every day sold stock count

I have a CRM project which maintains product sales orders for every organization.
I want to count the stock sold each day. I have managed to do this by looping over the dates, but that is obviously a ridiculous method and takes more time and memory.
Please help me to do it in a single query. Is it possible?
Here is my database structure for your reference.
product : id (PK), name
organization : id (PK), name
sales_order : id (PK), product_id (FK), organization_id (FK), sold_stock, sold_date(epoch time)
Expected Output for selected month :
organization | product | day1_sold_stock | day2_sold_stock | ..... | day30_sold_stock
http://sqlfiddle.com/#!15/e1dc3/3
Create the tablefunc extension:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Query :
select "proId" as ProductId ,product_name as ProductName,organizationName as OrganizationName,
coalesce( "1-day",0) as "1-day" ,coalesce( "2-day",0) as "2-day" ,coalesce( "3-day",0) as "3-day" ,
coalesce( "4-day",0) as "4-day" ,coalesce( "5-day",0) as "5-day" ,coalesce( "6-day",0) as "6-day" ,
coalesce( "7-day",0) as "7-day" ,coalesce( "8-day",0) as "8-day" ,coalesce( "9-day",0) as "9-day" ,
coalesce("10-day",0) as "10-day" ,coalesce("11-day",0) as "11-day" ,coalesce("12-day",0) as "12-day" ,
coalesce("13-day",0) as "13-day" ,coalesce("14-day",0) as "14-day" ,coalesce("15-day",0) as "15-day" ,
coalesce("16-day",0) as "16-day" ,coalesce("17-day",0) as "17-day" ,coalesce("18-day",0) as "18-day" ,
coalesce("19-day",0) as "19-day" ,coalesce("20-day",0) as "20-day" ,coalesce("21-day",0) as "21-day" ,
coalesce("22-day",0) as "22-day" ,coalesce("23-day",0) as "23-day" ,coalesce("24-day",0) as "24-day" ,
coalesce("25-day",0) as "25-day" ,coalesce("26-day",0) as "26-day" ,coalesce("27-day",0) as "27-day" ,
coalesce("28-day",0) as "28-day" ,coalesce("29-day",0) as "29-day" ,coalesce("30-day",0) as "30-day" ,
coalesce("31-day",0) as "31-day"
from crosstab(
'select hist.product_id,pr.name,o.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),sum(sold_stock)
from sales_order hist
left join product pr on pr.id = hist.product_id
left join organization o on o.id = hist.organization_id
where EXTRACT(MONTH FROM TO_TIMESTAMP(hist.sold_date/1000)) =5
and EXTRACT(YEAR FROM TO_TIMESTAMP(hist.sold_date/1000)) = 2017
group by hist.product_id,pr.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),o.name
order by o.name,pr.name',
'select d from generate_series(1,31) d')
as ("proId" int ,product_name text,organizationName text,
"1-day" float,"2-day" float,"3-day" float,"4-day" float,"5-day" float,"6-day" float
,"7-day" float,"8-day" float,"9-day" float,"10-day" float,"11-day" float,"12-day" float,"13-day" float,"14-day" float,"15-day" float,"16-day" float,"17-day" float
,"18-day" float,"19-day" float,"20-day" float,"21-day" float,"22-day" float,"23-day" float,"24-day" float,"25-day" float,"26-day" float,"27-day" float,"28-day" float,
"29-day" float,"30-day" float,"31-day" float);
Please note that this uses a PostgreSQL crosstab query. I have used coalesce to handle null values (so that the crosstab query shows "0" when there is no data to return).
Following query will help to find the same:
select o.name,
p.name,
sum(case when extract (day from to_timestamp(sold_date))=1 then sold_stock else 0 end)day1_sold_stock,
sum(case when extract (day from to_timestamp(sold_date))=2 then sold_stock else 0 end)day2_sold_stock,
sum(case when extract (day from to_timestamp(sold_date))=3 then sold_stock else 0 end)day3_sold_stock
from sales_order so,
organization o,
product p
where so.organization_id=o.id
and so.product_id=p.id
group by o.name,
p.name;
I have only provided the logic for 3 days; you can implement the same for the rest of the days.
Basically, first do the basic joins on id, then convert the epoch to a timestamp, extract the day, and sum sold_stock only where the day matches that column.
You have a few options here but it is important to understand the limitations first.
The big limitation is that the planner needs to know the record size before the planning stage, so this has to be explicitly defined, not dynamically defined. There are various ways of getting around this. At the end of the day, you are probably going to have something like Bavesh's answer, but there are some tools that may help.
Secondly, you may want to aggregate by date in a simple query joining the three tables and then pivot.
For the second approach, you have two options:
You could do a simple query and then pull the data into Excel or similar and create a pivot table there. This is probably the easiest solution.
Or you could use the tablefunc extension to create the crosstab for you.
Then we get to the first problem which is that if you are always doing 30 days, then it is easy if tedious. But if you want to do every day for a month, you run into the row length problem. Here what you can do is create a dynamic query in a function (pl/pgsql) and return a refcursor. In this case the actual planning takes place in the function and the planner doesn't need to worry about it on the outer level. Then you call FETCH on the output.
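If the query is issued from application code anyway, another way around the fixed column list is to generate the SQL text on the client before sending it. This is not the pl/pgsql refcursor approach described above, just a rough sketch of the same "build the query dynamically" idea. It runs against SQLite so it is self-contained, which means the date handling uses SQLite's strftime with 'unixepoch' rather than PostgreSQL's to_timestamp/EXTRACT. It also assumes sold_date holds epoch seconds. The function name build_daily_pivot_sql and the sample rows are made up.

```python
import calendar
import sqlite3


def build_daily_pivot_sql(year: int, month: int) -> str:
    """Return a query with one SUM(CASE ...) column per day of the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    # Only integers are interpolated, so no user text reaches the SQL.
    day_columns = ",\n".join(
        f"    SUM(CASE WHEN CAST(strftime('%d', so.sold_date, 'unixepoch') AS INTEGER) = {day} "
        f"THEN so.sold_stock ELSE 0 END) AS day{day}_sold_stock"
        for day in range(1, days_in_month + 1)
    )
    return f"""
SELECT o.name AS organization, p.name AS product,
{day_columns}
FROM sales_order so
JOIN organization o ON o.id = so.organization_id
JOIN product p ON p.id = so.product_id
WHERE strftime('%Y-%m', so.sold_date, 'unixepoch') = '{year:04d}-{month:02d}'
GROUP BY o.name, p.name
ORDER BY o.name, p.name
"""


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales_order (
        id INTEGER PRIMARY KEY, product_id INTEGER, organization_id INTEGER,
        sold_stock INTEGER, sold_date INTEGER  -- assumed: epoch seconds
    );
    INSERT INTO product VALUES (1, 'widget');
    INSERT INTO organization VALUES (1, 'acme');
    INSERT INTO sales_order VALUES
        (1, 1, 1, 5, CAST(strftime('%s', '2017-05-01 10:00:00') AS INTEGER)),
        (2, 1, 1, 3, CAST(strftime('%s', '2017-05-01 15:00:00') AS INTEGER)),
        (3, 1, 1, 4, CAST(strftime('%s', '2017-05-03 09:00:00') AS INTEGER));
""")

row = conn.execute(build_daily_pivot_sql(2017, 5)).fetchone()
print(row[:5])
```

Because the column list is built per month, February produces 28 or 29 day columns and May produces 31, without hand-writing each one.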

Narrowing sql query

I am trying to be more specific in my query. As you can see (link to image below), in the cupID column the groups A to H appear 3 times. All I am trying to do is have 3 queries: the first query outputs all groups A-H only once, the second query outputs the second set, and the third outputs the third set, if that makes sense?
This is the query
SELECT cupID, date, matchno,
clan1,
clan2,
si
FROM ws_bi2_cup_matches
WHERE ladID='0'
AND matchno = '6'
AND TYPE = 'gs'
GROUP BY clan1
ORDER BY cupID ASC
which shows: (take a look at picture)
http://s13.postimg.org/6rufgywcn/image.png
so query 1/2/3 should output separately like (a,b,c,d etc) instead of 1 query showing multiples (aaa,bbb,ccc,ddd etc)
Many thanks for help
Based on the assumption that you are performing a UNION ALL on your 3 queries, add a dummy column, e.g. SortOrder, and order by it.
In the following sample query (SQL Server), I assumed all 3 queries are the same; change them accordingly, keeping the dummy SortOrder column:
-- 1st query
SELECT cupID, date, matchno,
clan1,
clan2,
si,
1 as SortOrder -- dummy sort column
FROM ws_bi2_cup_matches
WHERE ladID='0'
AND matchno = '6'
AND TYPE = 'gs'
GROUP BY clan1
union all
-- 2nd query
SELECT cupID, date, matchno,
clan1,
clan2,
si,
2 as SortOrder -- dummy sort column
FROM ws_bi2_cup_matches
WHERE ladID='0'
AND matchno = '6'
AND TYPE = 'gs'
GROUP BY clan1
union all
-- 3rd query
SELECT cupID, date, matchno,
clan1,
clan2,
si,
3 as SortOrder -- dummy sort column
FROM ws_bi2_cup_matches
WHERE ladID='0'
AND matchno = '6'
AND TYPE = 'gs'
GROUP BY clan1
order by 7 -- dummy sort order column

Aggregate values and Pivot

I am partly on my way to solving this, but have hit a stumbling block, which I think can be solved with pivot(s).
I have the following SQL query, combining two temporary table variables (I may change these to temporary tables, as I think performance may become a problem since they will be hit a large number of times):
SELECT MeterId, MeterDataOutput.BuildingId, MeterDataOutput.Value,
MeterDataOutput.TimeStamp, UtilityId, SnapshotId
FROM #MeterDataOutput as MeterDataOutput INNER JOIN #InsertOutput AS InsertOutput
ON MeterDataOutput.BuildingId = InsertOutput.BuildingId
AND MeterDataOutput.[Timestamp] = InsertOutput.[TimeStamp]
This produces the following table:
I have then modified the query to group by BuildingId, SnapshotId, Timestamp and UtilityId, and applied the SUM() function to aggregate the Value field (and dropped the MeterId as it's not required), as follows:
SELECT MeterDataOutput.BuildingId, SUM(MeterDataOutput.Value) AS Value, MeterDataOutput.TimeStamp, UtilityId, SnapshotId
FROM #MeterDataOutput as MeterDataOutput
INNER JOIN #InsertOutput AS InsertOutput
ON MeterDataOutput.BuildingId = InsertOutput.BuildingId
AND MeterDataOutput.[Timestamp] = InsertOutput.[TimeStamp]
GROUP BY MeterDataOutput.BuildingId, MeterDataOutput.TimeStamp, UtilityId, SnapshotId
This query then provides me with the following table:
Now the bit I'm having trouble with is transforming the UtilityId values to columns, and placing the values from the Value field under each column. I.e:
For reference buildingId, Timestamp, Snapshot and Value are variable. UtilityId value 6 is always 'Electricity', 7 is always 'Gas' and 8 is always 'Water'.
I'm actually starting to get the hang of this SQL lark :)
Maybe something like this:
SELECT
pvt.BuildingId,
pvt.SnapshotId,
pvt.TimeStamp,
pvt.[6] AS Electricity,
pvt.[7] AS Gas,
pvt.[8] AS Water
FROM
(
SELECT
MeterDataOutput.BuildingId,
MeterDataOutput.Value,
MeterDataOutput.TimeStamp,
UtilityId,
SnapshotId
FROM #MeterDataOutput as MeterDataOutput
INNER JOIN #InsertOutput AS InsertOutput
ON MeterDataOutput.BuildingId = InsertOutput.BuildingId
AND MeterDataOutput.[Timestamp] = InsertOutput.[TimeStamp]
) AS SourceTable
PIVOT
(
SUM(Value)
FOR UtilityId IN ([6],[7],[8])
) AS pvt
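If PIVOT is not available, or you just want to check what it should return, the same shape can be produced with plain conditional aggregation. The sketch below is that alternative technique, not the PIVOT operator itself. It runs on SQLite from Python, with a single made-up meter_values table standing in for the joined temp tables; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- meter_values is a hypothetical stand-in for the joined
    -- #MeterDataOutput / #InsertOutput result
    CREATE TABLE meter_values (
        BuildingId INTEGER, SnapshotId INTEGER, TimeStamp TEXT,
        UtilityId INTEGER, Value REAL
    );
    INSERT INTO meter_values VALUES
        (1, 100, '2012-01-01', 6, 10.0),
        (1, 100, '2012-01-01', 6, 2.5),
        (1, 100, '2012-01-01', 7, 4.0),
        (1, 100, '2012-01-01', 8, 1.0);
""")

# One SUM(CASE ...) per utility: 6 = Electricity, 7 = Gas, 8 = Water.
# Non-matching rows yield NULL, which SUM skips, so a utility with no
# readings in a group comes back as NULL, as it would with PIVOT.
pivoted = conn.execute("""
    SELECT BuildingId, SnapshotId, TimeStamp,
           SUM(CASE WHEN UtilityId = 6 THEN Value END) AS Electricity,
           SUM(CASE WHEN UtilityId = 7 THEN Value END) AS Gas,
           SUM(CASE WHEN UtilityId = 8 THEN Value END) AS Water
    FROM meter_values
    GROUP BY BuildingId, SnapshotId, TimeStamp
""").fetchall()

print(pivoted)
```

Since the three utility ids are fixed, the column list never needs to change, which is what makes either form workable here.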