Converting rows to columns using multiple PIVOT in SQL

I have written a query to convert rows into columns using multiple PIVOT operators, one per measure, for months 4, 5 and 6. I did succeed in converting the rows into columns. Below is the query:
SELECT *
FROM
(SELECT [team],
        Count_Of_OrderId,
        Count_Of_OId,
        Avg_a,
        [Count_of_u],
        CONVERT(varchar(max), [month_from_Date]) + '_COAID' AS month_from_Date_COAID,
        CONVERT(varchar(max), [month_from_Date]) + '_CODID' AS month_from_Date_CODID,
        CONVERT(varchar(max), [month_from_Date]) + '_Avg_a' AS month_from_Date_Avg_a,
        CONVERT(varchar(max), [month_from_Date]) + '_Count_of_u' AS month_from_Date_Count_of_u
 FROM [MyTable]) AS S
PIVOT
(
    MAX(Count_Of_OrderId)
    FOR [month_from_Date_COAID] IN ([4_COAID], [5_COAID], [6_COAID])
) AS PivotTable1
PIVOT
(
    MAX(Count_Of_OId)
    FOR [month_from_Date_CODID] IN ([4_CODID], [5_CODID], [6_CODID])
) AS PivotTable2
PIVOT
(
    MAX(Avg_a)
    FOR [month_from_Date_Avg_a] IN ([4_Avg_a], [5_Avg_a], [6_Avg_a])
) AS PivotTable3
PIVOT
(
    MAX([Count_of_u])
    FOR [month_from_Date_Count_of_u] IN ([4_Count_of_u], [5_Count_of_u], [6_Count_of_u])
) AS PivotTable4
So the output was:
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
| Team   | COAID_4 | COAID_5 | COAID_6 | CODID_4 | CODID_5 | CODID_6 | Avg_a_4 | Avg_a_5 | Avg_a_6 | Count_of_u_4 | Count_of_u_5 | Count_of_u_6 |
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
| Team A | NULL    | NULL    | 17      | NULL    | NULL    | 15      | NULL    | NULL    | 1.13    | NULL         | NULL         | 7            |
| Team A | NULL    | 14      | NULL    | NULL    | 14      | NULL    | NULL    | 1       | NULL    | NULL         | 6            | NULL         |
| Team A | 9       | NULL    | NULL    | 7       | NULL    | NULL    | 1.29    | NULL    | NULL    | 5            | NULL         | NULL         |
| Team B | NULL    | NULL    | 12159   | NULL    | NULL    | 6482    | NULL    | NULL    | 1.88    | NULL         | NULL         | 40           |
| Team B | NULL    | 14287   | NULL    | NULL    | 6525    | NULL    | NULL    | 2.19    | NULL    | NULL         | 39           | NULL         |
| Team B | 15822   | NULL    | NULL    | 7117    | NULL    | NULL    | 2.22    | NULL    | NULL    | 40           | NULL         | NULL         |
| Team C | NULL    | NULL    | 293     | NULL    | NULL    | 174     | NULL    | NULL    | 1.68    | NULL         | NULL         | 6            |
| Team C | NULL    | 318     | NULL    | NULL    | 221     | NULL    | NULL    | 1.44    | NULL    | NULL         | 6            | NULL         |
| Team C | 312     | NULL    | NULL    | 183     | NULL    | NULL    | 1.7     | NULL    | NULL    | 6            | NULL         | NULL         |
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
Here each team has been split into 3 rows, one per month (4, 5 and 6). I would like to get the output as:
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
| Team   | COAID_4 | COAID_5 | COAID_6 | CODID_4 | CODID_5 | CODID_6 | Avg_a_4 | Avg_a_5 | Avg_a_6 | Count_of_u_4 | Count_of_u_5 | Count_of_u_6 |
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
| Team A | 9       | 14      | 17      | 7       | 14      | 15      | 1.29    | 1       | 1.13    | 5            | 6            | 7            |
| Team B | 15822   | 14287   | 12159   | 7117    | 6525    | 6482    | 2.22    | 2.19    | 1.88    | 40           | 39           | 40           |
| Team C | 312     | 318     | 293     | 183     | 221     | 174     | 1.7     | 1.44    | 1.68    | 6            | 6            | 6            |
+--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------+--------------+--------------+
I am not sure what the mistake in my code is.

A simple way: you can use MAX over your current result:
SELECT Team,
       MAX(COAID_4),
       MAX(COAID_5),
       MAX(COAID_6),
       ....
FROM T
GROUP BY Team
where T is the result of your current query.
But I think you are looking for conditional aggregation to do the pivot:
SELECT
    [team],
    MAX(CASE WHEN month_from_Date = 4 THEN Count_Of_OrderId END) '4_COAID',
    MAX(CASE WHEN month_from_Date = 5 THEN Count_Of_OrderId END) '5_COAID',
    MAX(CASE WHEN month_from_Date = 6 THEN Count_Of_OrderId END) '6_COAID',
    MAX(CASE WHEN month_from_Date = 4 THEN Count_Of_OId END) '4_CODID',
    MAX(CASE WHEN month_from_Date = 5 THEN Count_Of_OId END) '5_CODID',
    MAX(CASE WHEN month_from_Date = 6 THEN Count_Of_OId END) '6_CODID',
    MAX(CASE WHEN month_from_Date = 4 THEN Avg_a END) '4_Avg_a',
    MAX(CASE WHEN month_from_Date = 5 THEN Avg_a END) '5_Avg_a',
    MAX(CASE WHEN month_from_Date = 6 THEN Avg_a END) '6_Avg_a',
    MAX(CASE WHEN month_from_Date = 4 THEN Count_of_users END) '4_Count_of_u',
    MAX(CASE WHEN month_from_Date = 5 THEN Count_of_users END) '5_Count_of_u',
    MAX(CASE WHEN month_from_Date = 6 THEN Count_of_users END) '6_Count_of_u'
FROM [MyTable]
GROUP BY [team]
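To see why conditional aggregation collapses the monthly rows into one row per team, here is a minimal sketch using Python's bundled sqlite3 module as a stand-in for SQL Server; the miniature MyTable and its sample values are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of [MyTable]: one row per (team, month),
# mirroring the shape described in the question.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE MyTable (
    team TEXT, month_from_Date INTEGER,
    Count_Of_OrderId INTEGER, Count_Of_OId INTEGER,
    Avg_a REAL, Count_of_users INTEGER)""")
con.executemany("INSERT INTO MyTable VALUES (?,?,?,?,?,?)", [
    ("Team A", 4, 9, 7, 1.29, 5),
    ("Team A", 5, 14, 14, 1.0, 6),
    ("Team A", 6, 17, 15, 1.13, 7),
])

# Each MAX(CASE ...) picks out one month's value; GROUP BY team
# merges the three monthly rows into a single row.
rows = con.execute("""
    SELECT team,
           MAX(CASE WHEN month_from_Date = 4 THEN Count_Of_OrderId END) AS COAID_4,
           MAX(CASE WHEN month_from_Date = 5 THEN Count_Of_OrderId END) AS COAID_5,
           MAX(CASE WHEN month_from_Date = 6 THEN Count_Of_OrderId END) AS COAID_6
    FROM MyTable
    GROUP BY team
""").fetchall()
print(rows)  # [('Team A', 9, 14, 17)]
```

The same pattern extends to the other three measures by adding more MAX(CASE ...) columns.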

Related

Distinct ids for grouped values

I want to count the distinct ids in each numb and store them in a column. I tried this:
WITH T AS (
    SELECT
        MAX(CASE WHEN LOGS LIKE 'CAR%'     THEN REPLACE(LOGS, 'CAR-', '')     END) AS CAR,
        MAX(CASE WHEN LOGS LIKE 'MOT%'     THEN REPLACE(LOGS, 'MOT-', '')     END) AS MOTO,
        MAX(CASE WHEN LOGS LIKE 'BICYCLE%' THEN REPLACE(LOGS, 'BICYCLE-', '') END) AS BICYCLE,
        MAX(CASE WHEN LOGS LIKE 'SHIP%'    THEN REPLACE(LOGS, 'SHIP-', '')    END) AS SHIP,
        ID,
        ORIG,
        DATE_ID,
        NUMB,
        STEPS
    FROM dbo.test
    GROUP BY ORIG, DATE_ID, ID, NUMB, STEPS
)
SELECT ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP,
       (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG = '4567') AS COUNTER
FROM T
WHERE ORIG = '4567'
  AND NUMB IN ('1515', '1921', '2121')
GROUP BY ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP
I receive this output:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG         | NUMB   | STEPS | DATE_ID  | CAR   | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1  | 4567         | 1515   | 1     | 20201010 | HONDA | NULL | NULL    | NULL | 3       |
| 1  | 4567         | 1515   | 2     | 20201010 | HONDA | NULL | NULL    | NULL | 3       |
| 1  | 4567         | 1515   | 3     | 20201010 | HONDA | NULL | NULL    | NULL | 3       |
| 2  | 4567         | 1921   | 1     | 20201111 | NULL  | KTM  | NULL    | NULL | 3       |
| 3  | 4567         | 2121   | 1     | 20201231 | NULL  | NULL | NULL    | BOAT | 3       |
| 3  | 4567         | 2121   | 2     | 20201231 | NULL  | NULL | NULL    | BOAT | 3       |
| 3  | 4567         | 2121   | 3     | 20201231 | NULL  | NULL | NULL    | BOAT | 3       |
| 3  | 4567         | 2121   | 4     | 20201231 | NULL  | NULL | NULL    | BOAT | 3       |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
As you can see, the COUNTER column has the count of distinct ids, but across all NUMB values.
I want to output this:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG         | NUMB   | STEPS | DATE_ID  | CAR   | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1  | 4567         | 1515   | 1     | 20201010 | HONDA | NULL | NULL    | NULL | 2       |
| 1  | 4567         | 1515   | 2     | 20201010 | HONDA | NULL | NULL    | NULL | 2       |
| 2  | 4567         | 1515   | 1     | 20201010 | HONDA | NULL | NULL    | NULL | 2       |
| 2  | 4567         | 1921   | 1     | 20201111 | NULL  | KTM  | NULL    | NULL | 1       |
| 3  | 4567         | 2121   | 1     | 20201231 | NULL  | NULL | NULL    | BOAT | 2       |
| 3  | 4567         | 2121   | 2     | 20201231 | NULL  | NULL | NULL    | BOAT | 2       |
| 3  | 4567         | 2121   | 3     | 20201231 | NULL  | NULL | NULL    | BOAT | 2       |
| 1  | 4567         | 2121   | 1     | 20201231 | NULL  | NULL | NULL    | BOAT | 2       |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
1515 has 2 ids
1921 has 1 id
2121 has 2 ids
I also tried to place a GROUP BY NUMB inside (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG = '4567'), but it didn't work.
What you seem to want is:
count(distinct id) over (partition by orig, numb)
Alas, SQL Server doesn't support count(distinct) as a window function.
Happily, there is an easy workaround (which raises the question of why the syntax above is not supported):
(dense_rank() over (partition by orig, numb order by id asc) +
 dense_rank() over (partition by orig, numb order by id desc) - 1
) as counter
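The trick works because, within a partition, a row's ascending dense rank plus its descending dense rank always equals the number of distinct values plus one. A small sketch using Python's sqlite3 module (window functions require SQLite 3.25+, bundled with modern Python; the sample table and values are invented to match the question's shape):

```python
import sqlite3

# Tiny stand-in for dbo.test: NUMB 1515 has ids {1, 2}, 1921 has {2},
# 2121 has {3, 1}, so the expected counters are 2, 1, 2.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (id INTEGER, orig TEXT, numb TEXT)")
con.executemany("INSERT INTO test VALUES (?,?,?)", [
    (1, "4567", "1515"), (1, "4567", "1515"), (2, "4567", "1515"),
    (2, "4567", "1921"),
    (3, "4567", "2121"), (3, "4567", "2121"), (1, "4567", "2121"),
])

# asc rank + desc rank - 1 is constant within each partition and equals
# the number of distinct ids there, so DISTINCT collapses it cleanly.
rows = con.execute("""
    SELECT DISTINCT numb,
           DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id ASC)
         + DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id DESC)
         - 1 AS counter
    FROM test
    ORDER BY numb
""").fetchall()
print(rows)  # [('1515', 2), ('1921', 1), ('2121', 2)]
```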

Select all columns with only two distinct columns

In my SQL Server database, I have a table with many duplicate values. I need to fetch results that are distinct on the EID and YEAR columns, keeping the row that contains the fewest NULL values, so that I end up with one row per distinct (EID, YEAR) pair.
For example: in the table below, EID = E138442 with YEAR = 2019 occurs 21 times; of these duplicates, the row containing the fewest NULL values should be fetched.
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| EID | YEAR | JAN | FEB | MAR | APR | MAY | JUN | JUL | AUG | SEP | OCT | NOV | DEC |
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| E050339 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
| E050339 | 2020 | NULL | 6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E050339 | 2020 | 13 | 6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138348 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 | NULL |
| E138348 | 2019 | NULL | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| e138372 | 2019 | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138440 | 2019 | NULL | NULL | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | NULL | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | NULL | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | NULL | 7 | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | NULL | 7 | 7 | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | NULL | 1 | 7 | 7 | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2019 | NULL | 1 | NULL | 7 | 7 | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2020 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
| E138442 | 2020 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
| E138442 | 2020 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
| E138442 | 2020 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
I need a SQL query to fetch values as shown here:
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| EID | YEAR | JAN | FEB | MAR | APR | MAY | JUN | JUL | AUG | SEP | OCT | NOV | DEC |
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| E050339 | 2020 | 13 | 6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138348 | 2019 | NULL | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| e138372 | 2019 | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138440 | 2019 | NULL | NULL | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| E138442 | 2019 | NULL | 1 | NULL | 7 | 7 | 2 | 7 | 7 | 4 | 4 | 9 | 5 |
| E138442 | 2020 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
+---------+------+------+------+------+------+------+------+------+------+------+------+------+------+
The result table should contain exactly one final row per distinct (EID, YEAR) pair.
SELECT *
FROM TABLE_NAME C1
WHERE EXISTS (SELECT 1
              FROM TABLE_NAME C2
              WHERE C1.EID = C2.EID AND C1.YEAR = C2.YEAR
              HAVING COUNT(*) = 1)
ORDER BY
    C1.EID, C1.YEAR, C1.JAN, C1.FEB, C1.MAR, C1.APR,
    C1.MAY, C1.JUN, C1.JUL, C1.AUG, C1.SEP, C1.OCT, C1.NOV, C1.DEC ASC;
I tried the above code but it returned irrelevant results.
Since you have no other way to distinguish members of a group, and based on "select rows containing fewer NULL values", here is one way you can do it using CTEs. It's not clean, but it is probably the only way:
with cte as (
    select *,
           (case when JAN is null then 1 else 0 end) +
           (case when FEB is null then 1 else 0 end) + ... +
           (case when DEC is null then 1 else 0 end) as NullCount
    from tablename
)
, cte2 as (
    select EID, YEAR, min(NullCount) as min_nullcount
    from cte
    group by EID, YEAR
)
select t.*
from cte t
join cte2 tt
  on t.EID = tt.EID
 and t.YEAR = tt.YEAR
 and t.NullCount = tt.min_nullcount
If several rows in a group tie for the minimum NULL count and you want just one of them, you can use the query below:
select * from (
    select *,
           row_number() over (partition by EID, YEAR
                              order by (case when JAN is null then 1 else 0 end) + ... +
                                       (case when DEC is null then 1 else 0 end)) as rnk
    from tablename
) xx
where rnk = 1
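The ROW_NUMBER variant can be checked end to end with a minimal sketch using Python's sqlite3 module as a stand-in. Only three month columns are used, and SQLite's `(col IS NULL)` (which evaluates to 1 or 0) plays the role of the CASE expression; table and sample values are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (eid TEXT, year INTEGER, jan INTEGER, feb INTEGER, mar INTEGER)")
con.executemany("INSERT INTO t VALUES (?,?,?,?,?)", [
    ("E050339", 2020, None, 6, None),   # two NULLs
    ("E050339", 2020, 13, 6, None),     # one NULL -> this row should win
    ("E138440", 2019, None, None, 2),   # only row in its group
])

# Rank rows within each (eid, year) group by their NULL count,
# then keep the row with the fewest NULLs.
rows = con.execute("""
    SELECT eid, year, jan, feb, mar FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY eid, year
                   ORDER BY (jan IS NULL) + (feb IS NULL) + (mar IS NULL)
               ) AS rnk
        FROM t
    )
    WHERE rnk = 1
    ORDER BY eid
""").fetchall()
print(rows)  # [('E050339', 2020, 13, 6, None), ('E138440', 2019, None, None, 2)]
```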

How to update missing records in a sequence

I have missing records in a sequence, and my current output looks like this:
| 1882 | 25548860 | 4 | 30 | null | null |
| 1882 | 25548861 | 4 | 30 | null | null |
| 1882 | 25548882 | 4 | 30 | null | null |
| 1882 | 25548883 | 4 | 30 | null | null |
| 1882 | 25548884 | 4 | 30 | null | null |
| 1882 | 25548885 | 4 | 30 | null | null |
(records are missing in between, until 2122)
| 2122 | 25548860 | 4 | 30 | null | null |
| 2122 | 25548861 | 4 | 30 | null | null |
| 2122 | 25548882 | 4 | 30 | null | null |
| 2122 | 25548883 | 4 | 30 | null | null |
| 2122 | 25548884 | 4 | 30 | null | null |
| 2122 | 25548885 | 4 | 30 | null | null |
I want my output to be in the format below. Suggest a SQL query that will insert the missing records in MonetDB between 1883 and 2121.
| 1882 | 25548860 | 4 | 30 | null | null |
| 1882 | 25548861 | 4 | 30 | null | null |
| 1882 | 25548882 | 4 | 30 | null | null |
| 1882 | 25548883 | 4 | 30 | null | null |
| 1882 | 25548884 | 4 | 30 | null | null |
| 1882 | 25548885 | 4 | 30 | null | null |
| 1883 | 25548860 | 4 | 30 | null | null |
| 1883 | 25548861 | 4 | 30 | null | null |
| 1883 | 25548882 | 4 | 30 | null | null |
| 1883 | 25548883 | 4 | 30 | null | null |
| 1883 | 25548884 | 4 | 30 | null | null |
| 1883 | 25548885 | 4 | 30 | null | null |
........ ..........
........ ..........
| 2122 | 25548860 | 4 | 30 | null | null |
| 2122 | 25548861 | 4 | 30 | null | null |
| 2122 | 25548882 | 4 | 30 | null | null |
| 2122 | 25548883 | 4 | 30 | null | null |
| 2122 | 25548884 | 4 | 30 | null | null |
| 2122 | 25548885 | 4 | 30 | null | null |
If you know in advance the range of missing ids, you can use generate_series(). Assuming that your table is called mytable and has columns (id, col1, col2, col3, col4, col5), you can duplicate the records that have id 1882 to fill the gap with the following query:
insert into mytable (id, col1, col2, col3, col4, col5)
select value, col1, col2, col3, col4, col5
from sys.generate_series(1883, 2121, 1)
cross join mytable t
where t.id = 1882
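The same cross-join gap-fill can be sketched outside MonetDB. SQLite (via Python's sqlite3 module) has no built-in sys.generate_series, so a recursive CTE plays that role here; the table shape and sample data are invented, with just three columns for brevity.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (id INTEGER, col1 INTEGER, col2 INTEGER)")
con.executemany("INSERT INTO mytable VALUES (?,?,?)", [
    (1882, 25548860, 4), (1882, 25548861, 4),   # template rows
    (2122, 25548860, 4), (2122, 25548861, 4),   # far end of the gap
])

# Recursive CTE standing in for sys.generate_series(1883, 2121, 1):
# cross-join each generated id with the id=1882 template rows.
con.execute("""
    WITH RECURSIVE series(value) AS (
        SELECT 1883 UNION ALL SELECT value + 1 FROM series WHERE value < 2121
    )
    INSERT INTO mytable (id, col1, col2)
    SELECT s.value, t.col1, t.col2
    FROM series s CROSS JOIN mytable t
    WHERE t.id = 1882
""")
n = con.execute("SELECT COUNT(DISTINCT id) FROM mytable").fetchone()[0]
print(n)  # ids 1882..2122 are now all present: 241
```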
Assume the schema of your table is something like:
create table mytable(id int, col1 int, col2 int, col3 int, col4 int, col5 int);
You can fill in the "missing records" with:
insert into mytable
select *, 4, 30, null, null
from sys.generate_series(1883, 2121, 1),
(select distinct col1 from mytable where id = 1882) as tmp;
However, the new records will be appended to the existing records, so if you want to have them returned in the order you showed above, you need an additional order by:
select * from mytable order by id, col1;

How do I select every value that corresponds to another value in another table in SQL

The title may sound a little confusing, but essentially what I am trying to do is pull from the table below. The query I used to create the table below was:
select d.FOLDER_ID, d.PKG_ID, p.file_id, d.PKG_START_TIME, d.PKG_END_TIME, d.ISVALID, d.VALIDFILE_COUNT, d.INVALIDFILE_COUNT, d.ISLOADED, d.LOADEDFILE_COUNT, d.REJECTEDFILE_COUNT
from DATA_EXCHANGE_PACKAGE d full outer join PACKAGE_FILE p on d.PKG_ID = p.PKG_ID
where d.pkg_id = p.pkg_id
order by PKG_START_TIME asc
This table contains data from two different tables, as you can see in the query, and it selects the first records based on package start time.
What I am trying to achieve is a query that can choose a number of pkg_id's to return, but with every file_id for each chosen pkg_id. For example, my database may have 100 packages, but I only want every file_id for the first 10 packages. How do I do this? Using TOP I have only been able to choose the first 5 records, or 5 distinct pkg_id rows but without every file_id for those 5 pkg_id's. Any help would be appreciated. I understand GROUP BY and PARTITION may achieve what I want, but I haven't been successful. I'm not the greatest at SQL, which is why I'm struggling; I thought this query was going to be easier to create. I'm also certain the WHERE clause is pointless, but I kept it regardless.
Also, let's assume the FOLDER_ID is always 1.
+-----------+--------+---------+-------------------------+-------------------------+---------+-----------------+-------------------+----------+------------------+--------------------+
| FOLDER_ID | PKG_ID | file_id | PKG_START_TIME | PKG_END_TIME | ISVALID | VALIDFILE_COUNT | INVALIDFILE_COUNT | ISLOADED | LOADEDFILE_COUNT | REJECTEDFILE_COUNT |
+-----------+--------+---------+-------------------------+-------------------------+---------+-----------------+-------------------+----------+------------------+--------------------+
| 1 | 1 | 1 | 2019-11-19 14:59:24.343 | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 1 | 2 | 2 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 3 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 4 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 5 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 6 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 7 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 8 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 9 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 10 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 2 | 11 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 12 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 13 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 14 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 15 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 16 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 17 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 18 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 19 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 20 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
| 1 | 3 | 21 | 2019-11-19 15:58:26.733 | NULL | 1 | 10 | 0 | NULL | NULL | NULL |
As an example of what I want to achieve with the above data: I only want to choose the first two distinct pkg_id's based on pkg_start_time in ascending order, but with every file_id for those two pkg_id's in the output. The table below is what I want my query to select from the data above.
+-----------+--------+---------+-------------------------+--------------+---------+-----------------+-------------------+----------+------------------+--------------------+--------+
| FOLDER_ID | PKG_ID | file_id | PKG_START_TIME | PKG_END_TIME | ISVALID | VALIDFILE_COUNT | INVALIDFILE_COUNT | ISLOADED | LOADEDFILE_COUNT | REJECTEDFILE_COUNT | seqnum |
+-----------+--------+---------+-------------------------+--------------+---------+-----------------+-------------------+----------+------------------+--------------------+--------+
| 1 | 1 | 1 | 2019-11-19 14:59:24.343 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 |
| 1 | 2 | 2 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 1 |
| 1 | 2 | 3 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 2 |
| 1 | 2 | 4 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 3 |
| 1 | 2 | 5 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 4 |
| 1 | 2 | 6 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 5 |
| 1 | 2 | 7 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 6 |
| 1 | 2 | 8 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 7 |
| 1 | 2 | 9 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 8 |
| 1 | 2 | 10 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 9 |
| 1 | 2 | 11 | 2019-11-19 15:10:20.157 | NULL | 1 | 10 | 0 | NULL | NULL | NULL | 10 |
+-----------+--------+---------+-------------------------+--------------+---------+-----------------+-------------------+----------+------------------+--------------------+--------+
Edit: I have solved my question
I have no idea why you are using a full join, so I'm replacing it with an inner join. You want row_number():
select dp.*
from (select d.FOLDER_ID, d.PKG_ID, p.file_id, d.PKG_START_TIME,
             d.PKG_END_TIME, d.ISVALID, d.VALIDFILE_COUNT,
             d.INVALIDFILE_COUNT, d.ISLOADED, d.LOADEDFILE_COUNT,
             d.REJECTEDFILE_COUNT,
             row_number() over (partition by d.pkg_id order by p.file_id) as seqnum
      from DATA_EXCHANGE_PACKAGE d inner join
           PACKAGE_FILE p
           on d.PKG_ID = p.PKG_ID
     ) dp
where seqnum <= 10
order by PKG_START_TIME asc
I have solved the question I was asking; the query I made is below.
select d.FOLDER_ID, d.PKG_ID, p.file_id, d.PKG_START_TIME, d.PKG_END_TIME, d.ISVALID, d.VALIDFILE_COUNT, d.INVALIDFILE_COUNT, d.ISLOADED, d.LOADEDFILE_COUNT, d.REJECTEDFILE_COUNT
from DATA_EXCHANGE_PACKAGE d full outer join PACKAGE_FILE p on d.PKG_ID = p.PKG_ID
where d.PKG_ID = p.PKG_ID and d.PKG_ID > (select max(d.PKG_ID) - 5 from DATA_EXCHANGE_PACKAGE d ) and d.FOLDER_ID = 1
order by PKG_START_TIME desc
This query selects every record until 5 distinct pkg_id's have been chosen. I'm going to use this query in Python and expose that 5 as a parameter so users can choose how many packages they want returned. In this table every new package gets a higher pkg_id than the previous one, so instead of using max(d.PKG_ID) I could probably also use the datetime with max, but this query is good enough for now. The FOLDER_ID value will also be a parameter.
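An alternative that does not rely on pkg_id growing monotonically is to rank the distinct packages with DENSE_RANK over the start time and keep every joined file row whose package rank is within the requested count. Here is a hedged sketch using Python's sqlite3 module with invented mini-tables standing in for DATA_EXCHANGE_PACKAGE and PACKAGE_FILE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pkg (pkg_id INTEGER, start_time TEXT)")
con.execute("CREATE TABLE pkg_file (pkg_id INTEGER, file_id INTEGER)")
con.executemany("INSERT INTO pkg VALUES (?,?)", [
    (1, "2019-11-19 14:59:24"),
    (2, "2019-11-19 15:10:20"),
    (3, "2019-11-19 15:58:26"),
])
# Package 1 has one file, packages 2 and 3 have ten files each.
con.executemany("INSERT INTO pkg_file VALUES (?,?)",
                [(1, 1)] + [(2, f) for f in range(2, 12)]
                         + [(3, f) for f in range(12, 22)])

# DENSE_RANK gives every row of the same package the same rank,
# so "rank <= 2" keeps ALL files of the first two packages.
rows = con.execute("""
    SELECT pkg_id, file_id FROM (
        SELECT d.pkg_id, p.file_id,
               DENSE_RANK() OVER (ORDER BY d.start_time) AS pkg_rank
        FROM pkg d JOIN pkg_file p ON d.pkg_id = p.pkg_id
    )
    WHERE pkg_rank <= 2
""").fetchall()
print(len(rows))  # 1 file from package 1 + 10 files from package 2 = 11
```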

How do I join two rows with respect to date and time (hour and minute)

I want to join these rows where the date and time are the same but the seconds differ. I used the MAX function, but it did not return the proper result.
| DateandTime             | t1   | t2   | t3   | t4   | t5   | t6   | t7   |
|-------------------------|------|------|------|------|------|------|------|
| 2019-08-06 15:44:04.000 | NULL | 0    | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:44:03.000 | 0    | NULL | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:43:04.000 | NULL | NULL | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:43:03.000 | 0    | 0    | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:42:04.000 | NULL | NULL | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:42:03.000 | 0    | 0    | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:41:04.000 | NULL | NULL | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:41:03.000 | 0    | 0    | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:40:04.000 | NULL | 0    | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:40:03.000 | 0    | NULL | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:39:04.000 | NULL | 0    | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:39:03.000 | 0    | NULL | NULL | NULL | NULL | NULL | NULL |
| 2019-08-06 15:38:04.000 | NULL | NULL | 29   | 20   | 150  | 20   | 20   |
| 2019-08-06 15:38:03.000 | 0    | 0    | NULL | NULL | NULL | NULL | NULL |
If you are using MS SQL Server, the following script will work. You can apply the same logic to other databases by using the appropriate DATETIME conversion for that database.
SELECT
    FORMAT(CAST(DateandTime AS DATETIME), 'dd-MM-yyyy HH:mm') AS DateandTime,
    -- You can also use SUM/AVG for these columns if both rows
    -- can carry data, depending on your requirement.
    MAX(t1) T1,
    MAX(t2) T2,
    MAX(t3) T3,
    MAX(t4) T4,
    MAX(t5) T5,
    MAX(t6) T6,
    MAX(t7) T7
FROM your_table
GROUP BY FORMAT(CAST(DateandTime AS DATETIME), 'dd-MM-yyyy HH:mm')
-- The CAST(DateandTime AS DATETIME) is not required if your
-- DateandTime column is already of type DATETIME
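The same minute-level grouping can be sketched with Python's sqlite3 module, where strftime('%Y-%m-%d %H:%M', ...) plays the role of FORMAT; the sample table mirrors the first two minute-pairs from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (DateandTime TEXT, t1 INTEGER, t2 INTEGER)")
con.executemany("INSERT INTO readings VALUES (?,?,?)", [
    ("2019-08-06 15:44:04", None, 0),
    ("2019-08-06 15:44:03", 0, None),
    ("2019-08-06 15:43:04", None, 29),
    ("2019-08-06 15:43:03", 0, None),
])

# Truncate each timestamp to the minute and merge the row pair;
# MAX ignores NULLs, so the non-NULL value from either row survives.
rows = con.execute("""
    SELECT strftime('%Y-%m-%d %H:%M', DateandTime) AS minute,
           MAX(t1) AS t1, MAX(t2) AS t2
    FROM readings
    GROUP BY minute
    ORDER BY minute DESC
""").fetchall()
print(rows)  # [('2019-08-06 15:44', 0, 0), ('2019-08-06 15:43', 0, 29)]
```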