Find rows with maximum count which have multiple joins in SQL - sql

firstname lastname quantity object no datecol
soman mitra 50 1 31-05-2021
nitya sharma 100 2 31-05-2021
tanisha agarwal 200 3 31-05-2021
tarun mittal 300 4 31-05-2021
Above is the output of a query that joins multiple tables. Now I want to find the rows which have the maximum quantity.
How can I do this, given that I have multiple tables joined? Please help.

If you just want the row(s) having the maximum quantity, then use TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES -- current select list
FROM -- current query
ORDER BY quantity DESC;
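TOP 1 WITH TIES is SQL Server syntax. On engines without it, the same idea can be sketched with RANK(). The snippet below is only an illustration run against an in-memory SQLite database (window functions need SQLite 3.25 or later); the table name joined stands in for the output of the joined query, and the fifth row is a hypothetical tie added to show that all tied rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "joined" is a stand-in for the result of the multi-table join.
conn.execute(
    "CREATE TABLE joined (firstname TEXT, lastname TEXT, quantity INTEGER, object_no INTEGER)"
)
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?, ?)",
    [
        ("soman", "mitra", 50, 1),
        ("nitya", "sharma", 100, 2),
        ("tanisha", "agarwal", 200, 3),
        ("tarun", "mittal", 300, 4),
        ("extra", "person", 300, 5),  # hypothetical row, added only to create a tie
    ],
)

# RANK() gives every row with the highest quantity the rank 1,
# so filtering on rnk = 1 keeps all tied rows, like TOP 1 WITH TIES.
top_rows = conn.execute(
    """
    SELECT firstname, lastname, quantity
    FROM (
        SELECT j.*, RANK() OVER (ORDER BY quantity DESC) AS rnk
        FROM joined AS j
    )
    WHERE rnk = 1
    ORDER BY firstname
    """
).fetchall()
print(top_rows)
```

In a real query, the inner SELECT would be the existing join rather than a single table.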

Related

SQL query which will extract conditionally the values from top categories the first and the 2nd where CATEGORY is OTHER

I have this table. It is just a small example; the real table has more observations.
id CATEGORY AMOUNT
1 TECH 120
1 FUN 220
2 OTHER 340
2 PARENTS 220
The table holds, for each id, the amount spent in each category. I want to select the ID and the category in which that ID spends the most, but if that category is OTHER I want the 2nd highest-spending category instead.
I have a constraint: I CANNOT use a subquery or a select with the filter WHERE CATEGORY <> 'OTHER'. It just makes my machine run out of memory (for reasons I don't know).
This is what I have tried.
I created a row_number() over (partition by id order by amount desc) rn
and then ran
select id, category from table where rn = 1 group by 1, 2
But I don't know how to tell the query: if CATEGORY is OTHER, then take row number 2.
id CATEGORY AMOUNT ROW NUM
1 TECH 120 2
1 FUN 220 1
2 OTHER 340 1
2 PARENTS 220 2
Another thing I was thinking of doing is to write a QUALIFY clause:
QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY AMOUNT DESC) = 1
But here too I get only the first record per ID, which can also be OTHER. It would help if I could filter it out within QUALIFY, i.e. say: if CATEGORY is 'OTHER', don't consider it.
I am using Databricks.
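One possible approach, sketched here rather than confirmed on Databricks, is to push OTHER to the bottom of each partition inside the window's ORDER BY, so that row number 1 is always the highest non-OTHER category. As long as every id has at least one category besides OTHER, this matches "take the top row unless it is OTHER, then take the second". The snippet runs against in-memory SQLite, which has no QUALIFY, so it wraps the window function in a derived table; the table name spend is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "spend" is a hypothetical name for the table shown in the question.
conn.execute("CREATE TABLE spend (id INTEGER, category TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO spend VALUES (?, ?, ?)",
    [(1, "TECH", 120), (1, "FUN", 220), (2, "OTHER", 340), (2, "PARENTS", 220)],
)

# The CASE expression sorts OTHER after every real category,
# then amount DESC picks the biggest spend among the rest.
picked = conn.execute(
    """
    SELECT id, category
    FROM (
        SELECT id, category,
               ROW_NUMBER() OVER (
                   PARTITION BY id
                   ORDER BY CASE WHEN category = 'OTHER' THEN 1 ELSE 0 END,
                            amount DESC
               ) AS rn
        FROM spend
    )
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()
print(picked)
```

On an engine that supports QUALIFY, the same ROW_NUMBER() expression, with the CASE term in its ORDER BY, could presumably be compared to 1 directly in the QUALIFY clause, which avoids both the derived table and the WHERE CATEGORY <> 'OTHER' filter.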

BigQuery - count the count of a column

Newbie on SQL and BigQuery in general. How to count the count of a column in BigQuery? As you can see from the code sample, the query returns the count of appName as WhitelistNames, but I would like to get a count of WhitelistNames.
SELECT
COUNT(appName) AS WhitelistNames,
bridgeToken
FROM (
SELECT
bridgeToken,
appName
FROM
[DB]
GROUP BY
bridgeToken,
appName )
GROUP BY
bridgeToken
ORDER BY
WhitelistNames DESC
The current query returns:
Row UniquebridgeToken WhitelistEntries
1 11111 5
2 22222 13
3 33333 3
4 44444 3
5 55555 3
But I would like to count the occurrences of UniquebridgeToken per WhitelistEntries value, like below. Thanks in advance.
Row WhitelistEntries BridgeCount
1 13 1
2 5 1
3 3 3
Below is for BigQuery Standard SQL and based on how I interpreted your question - which is:
for each bridgeToken how many unique appName's and how many total entries (rows) for that bridge
#standardSQL
SELECT
COUNT(DISTINCT appName) AS WhitelistNames,
COUNT(bridgeToken) AS BridgeCount
FROM `project.dataset.your_table`
GROUP BY bridgeToken
I understand that what you want is to count how many UniquebridgeToken values have the same number of WhitelistEntries. I think this is what you are looking for:
WITH nestedQuery AS (SELECT
bridgeToken,
COUNT(DISTINCT appName) AS WhitelistEntries
FROM `project_name.dataset_name.table_name`
GROUP BY
bridgeToken)
SELECT n.WhitelistEntries, COUNT(*) AS BridgeCount
FROM nestedQuery AS n
GROUP BY n.WhitelistEntries
ORDER BY n.WhitelistEntries DESC
You can read about WITH clause here: https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#with_clause
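The "count of counts" pattern can be tried out locally. The following is a minimal sketch against in-memory SQLite rather than BigQuery; the table name whitelist and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "whitelist" and its rows are hypothetical sample data.
conn.execute("CREATE TABLE whitelist (bridgeToken TEXT, appName TEXT)")
conn.executemany(
    "INSERT INTO whitelist VALUES (?, ?)",
    [
        ("11111", "a"), ("11111", "b"), ("11111", "a"),  # duplicate row on purpose
        ("22222", "a"), ("22222", "b"), ("22222", "c"),
        ("33333", "a"),
        ("44444", "a"),
        ("55555", "b"),
    ],
)

# Step 1 (CTE): distinct app names per bridge token.
# Step 2: how many bridge tokens share each of those counts.
count_of_counts = conn.execute(
    """
    WITH per_token AS (
        SELECT bridgeToken, COUNT(DISTINCT appName) AS WhitelistEntries
        FROM whitelist
        GROUP BY bridgeToken
    )
    SELECT WhitelistEntries, COUNT(*) AS BridgeCount
    FROM per_token
    GROUP BY WhitelistEntries
    ORDER BY WhitelistEntries DESC
    """
).fetchall()
print(count_of_counts)
```

Here one token has 3 distinct apps, one has 2, and three tokens have 1 each, so the result has the same shape as the desired output in the question.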

Using GROUP BY, select ID of record in each group that has lowest ID

I am creating a file organization system where you can add content items to multiple folders.
I am storing the data in a table that has a structure similar to the following:
ID TypeID ContentID FolderID
1 101 1001 1
2 101 1001 2
3 102 1002 3
4 103 1002 2
5 103 1002 1
6 104 1001 1
7 105 1005 2
I am trying to select the first record for each unique TypeID and ContentID pair. For the above table, I would want the results to be:
ID
1
3
4
6
7
As you can see, the pairs 101 1001 and 103 1002 were each added to two folders, yet I only want the record with the first folder they were added to.
When I try the following query, however, I only get results that have at least two entries with the same TypeID and ContentID:
select MIN(ID)
from table
group by TypeID, ContentID
results in
ID
1
4
If I change MIN(ID) to MAX(ID) I get the correct number of results, yet I get the record with the last folder they were added to and not the first folder:
ID
2
3
5
6
7
Am I using GROUP BY or the MIN wrong? Is there another way that I can accomplish this task of selecting the first record of each TypeID ContentID pair?
MIN() and MAX() should return the same number of rows: each produces exactly one row per group, so swapping one for the other cannot change how many rows the query returns.
Is this query part of a larger query? Looking at the sample data provided, I would assume that this code is only a snippet of a larger operation. Do you later join TypeID, ContentID or FolderID to the tables those IDs reference?
If yes, the problem is likely caused by another part of your query and not by this select statement. If you are using joins or nested select statements, you can get a different number of results when the referenced tables do not contain a record for every foreign ID.
Another suggestion: check whether any of the values in your records are NULL. Although this should not affect the GROUP BY, I have sometimes encountered strange behavior when dealing with NULL values.
Use ROW_NUMBER
WITH CTE AS
(SELECT ID,TypeID,ContentID,FolderID,
ROW_NUMBER() OVER (PARTITION BY TypeID,ContentID ORDER BY ID) as rn FROM t
)
SELECT ID FROM CTE WHERE rn=1
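To check these claims against the sample data, here is a small sketch using in-memory SQLite (window functions need SQLite 3.25 or later). The table name content_folders is an assumption, since the question only calls it "table". On this data, MIN(ID) with GROUP BY and the ROW_NUMBER query both return the five IDs the asker expects, and MAX(ID) returns the same number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "content_folders" is a hypothetical name; the rows are the ones from the question.
conn.execute(
    "CREATE TABLE content_folders (ID INTEGER, TypeID INTEGER, ContentID INTEGER, FolderID INTEGER)"
)
conn.executemany(
    "INSERT INTO content_folders VALUES (?, ?, ?, ?)",
    [
        (1, 101, 1001, 1), (2, 101, 1001, 2), (3, 102, 1002, 3), (4, 103, 1002, 2),
        (5, 103, 1002, 1), (6, 104, 1001, 1), (7, 105, 1005, 2),
    ],
)

def ids(sql):
    """Run a query that returns one ID column and return the IDs sorted."""
    return sorted(row[0] for row in conn.execute(sql))

min_ids = ids("SELECT MIN(ID) FROM content_folders GROUP BY TypeID, ContentID")
max_ids = ids("SELECT MAX(ID) FROM content_folders GROUP BY TypeID, ContentID")
rn_ids = ids(
    """
    WITH CTE AS (
        SELECT ID,
               ROW_NUMBER() OVER (PARTITION BY TypeID, ContentID ORDER BY ID) AS rn
        FROM content_folders
    )
    SELECT ID FROM CTE WHERE rn = 1
    """
)
print(min_ids, max_ids, rn_ids)
```

Since the plain GROUP BY already gives the expected rows here, a result with only two IDs most likely comes from something outside this statement, such as a join or a HAVING clause in the surrounding query.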
Use it with ORDER BY:
select *
from table
group by TypeID, ContentID
order by id
SQLFiddle: http://sqlfiddle.com/#!9/024016/12
Try first(id) instead of min(id):
select first(id)
from table
group by TypeID, ContentID
Does it work?

SQL Modification

I have a query (Main Query) like this. I am executing it in Toad connected to a Netezza DB.
SELECT *
FROM db1.schema1.Table1
WHERE (pd_num, pd_num_mtr, pd_num_prefix, sqr_num) IN
(SELECT pd_num,
pd_num_mtr,
pd_num_prefix,
max (sqr_num) sqr_num
FROM db1.schema1.table1
WHERE create_date >= '01/01/2012' AND cd_operator <> 'N'
GROUP BY pd_num, pd_num_mtr, pd_num_prefix)
When I execute this I get some 1 million records as my output. I further executed a query (Query2) to analyze the number of records belonging to the group as follows.
select pd_num, pd_num_mtr, count(*)
from db1.schema1.table1
GROUP BY pd_num, pd_num_mtr
order by count(*) desc
I get the below output for this.
pd_num pd_num_mtr count(*)
001 15 500
002 15 200
003 30 100
Which means I have some 500 records pulled for the first pd_num and pd_num_mtr combination, with each of these records having an update_timestamp value. Now this needs to be modified as follows.
Among these 500 records, I need to pull only the one with the maximum update_timestamp, which will limit the count to 1 record instead of 500; likewise 1 record from the 200 records and 1 record from the 100 records, each with the max update_timestamp value.
How can I modify the first query (main query) to achieve this, so that if I run Query2, I get the below as the output?
pd_num pd_num_mtr count(*)
001 15 1
002 15 2
003 30 3
Appreciate your help again. Thank you.
We will have to use the ROW_NUMBER function for this, assuming update_timestamp is your timestamp column.
SELECT PD_NUM_MTR,PD_NUM_PREFIX
FROM
(
SELECT PD_NUM_MTR,PD_NUM_PREFIX,ROW_NUMBER() OVER (PARTITION BY PD_NUM_MTR,PD_NUM_PREFIX ORDER BY update_timestamp desc ) AS RK
FROM DB1.SCHEMA1.TABLE1
)
WHERE RK=1;
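The "latest row per group" idea can be sketched end to end as follows. This runs on in-memory SQLite, not Netezza; the table name table1, the sample rows and the timestamps are invented for illustration, and the partition uses pd_num and pd_num_mtr because those are the grouping columns of Query2 in the question. Adjust the PARTITION BY list to whatever defines a group in the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of table1: several versions per (pd_num, pd_num_mtr).
conn.execute("CREATE TABLE table1 (pd_num TEXT, pd_num_mtr INTEGER, update_timestamp TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("001", 15, "2012-01-01"), ("001", 15, "2012-03-01"), ("001", 15, "2012-02-01"),
        ("002", 15, "2012-01-05"), ("002", 15, "2012-01-06"),
        ("003", 30, "2012-05-01"),
    ],
)

# Number the rows of each group from newest to oldest and keep only the newest.
latest = conn.execute(
    """
    SELECT pd_num, pd_num_mtr, update_timestamp
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY pd_num, pd_num_mtr
                   ORDER BY update_timestamp DESC
               ) AS rk
        FROM table1 AS t
    )
    WHERE rk = 1
    ORDER BY pd_num
    """
).fetchall()
print(latest)
```

Each group contributes exactly one row, so re-running a count per group on this result gives 1 everywhere. If two rows in a group share the same maximum timestamp, ROW_NUMBER picks one of them arbitrarily unless a tie-breaker column is added to the ORDER BY.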

Sqlite: Selecting records spread over total records

I have a sql / sqlite question. I need to write a query that selects some values from a sqlite database table. I always want the query to return at most 20 records. If more than 20 records match, I need to select 20 records that are spread evenly (not randomly) over the total records. It is also important that I always select the first and last value from the table when sorted on the date. These records should be first and last in the result.
I know how to accomplish this in code but it would be perfect to have a sqlite query that can do the same.
The query I'm using now is really simple and looks like this:
"SELECT value,date,valueid FROM tblvalue WHERE tblvalue.deleted=0 ORDER BY DATE(date)"
Suppose, for example, I have these records in the table and, to make the example easier, the maximum result I want is 5.
id value date
1 10 2010-04-10
2 8 2010-04-11
3 8 2010-04-13
4 9 2010-04-15
5 10 2010-04-16
6 9 2010-04-17
7 8 2010-04-18
8 11 2010-04-19
9 9 2010-04-20
10 10 2010-04-24
The result I would like is spread evenly like this:
id value date
1 10 2010-04-10
3 8 2010-04-13
5 10 2010-04-16
7 8 2010-04-18
10 10 2010-04-24
Hope that explains what I want, thanks!
Something like this should work for you:
SELECT *
FROM (
SELECT v.value, v.date, v.valueid
FROM tblvalue v
LEFT OUTER JOIN (
SELECT min(DATE(date)) as MinDate, max(DATE(date)) as MaxDate
FROM tblvalue
WHERE tblvalue.deleted = 0
) vm on DATE(v.date) = vm.MinDate or DATE(v.date) = vm.MaxDate
WHERE v.deleted = 0
ORDER BY vm.MinDate desc, Random()
LIMIT 20
) a
ORDER BY DATE(date)
I think you want this:
SELECT value,date,valueid FROM tblvalue WHERE tblvalue.deleted=0
ORDER BY DATE(date), Random()
LIMIT 20
In other words, you want to select rows whose date comes from the sorted list of dates, taking every odd element, plus the last recorded element (the one with the latest date), all limited to at most 20 rows?
If that's the case, then I think this one should do:
SELECT id,value,date
FROM source_table
WHERE date IN (
SELECT date FROM source_table
WHERE (rowid-1) % 2 = 0
OR date = (SELECT max(date) FROM source_table)
ORDER BY date)
LIMIT 20
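A non-random alternative, offered as a sketch under assumptions rather than a tested production query: number the rows by date from 0 to n-1, then keep the rows at positions k*(n-1)/(m-1) (integer division) for k = 0..m-1, where m is the maximum number of rows. Position 0 and position n-1 are always included, and when n is not larger than m every row is kept. It assumes m is at least 2 and a SQLite version with window functions (3.25 or later); the CTE names picks and ordered are made up.

```python
import sqlite3

MAX_ROWS = 5  # the question wants 20; 5 matches its worked example

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblvalue (valueid INTEGER PRIMARY KEY, value INTEGER, date TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO tblvalue (valueid, value, date) VALUES (?, ?, ?)",
    [
        (1, 10, "2010-04-10"), (2, 8, "2010-04-11"), (3, 8, "2010-04-13"),
        (4, 9, "2010-04-15"), (5, 10, "2010-04-16"), (6, 9, "2010-04-17"),
        (7, 8, "2010-04-18"), (8, 11, "2010-04-19"), (9, 9, "2010-04-20"),
        (10, 10, "2010-04-24"),
    ],
)

query = """
WITH RECURSIVE picks(k) AS (          -- k = 0, 1, ..., m-1
    SELECT 0
    UNION ALL
    SELECT k + 1 FROM picks WHERE k + 1 < :m
),
ordered AS (                          -- 0-based position by date, plus total row count
    SELECT valueid, value, date,
           ROW_NUMBER() OVER (ORDER BY DATE(date), valueid) - 1 AS idx,
           COUNT(*) OVER () AS n
    FROM tblvalue
    WHERE deleted = 0
)
SELECT valueid, value, date
FROM ordered
WHERE idx IN (SELECT k * (ordered.n - 1) / (:m - 1) FROM picks)
ORDER BY idx
"""
sampled = conn.execute(query, {"m": MAX_ROWS}).fetchall()
for row in sampled:
    print(row)
```

With the ten sample rows and a maximum of 5, the positions are 0, 2, 4, 6 and 9, i.e. the rows with ids 1, 3, 5, 7 and 10 from the example in the question.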