SQL group-by largest

I have a table of co-occurrence counts for pairs of objects that looks like this:
col1 col2 count
item1 item2 3
item3 item2 1
item1 item4 1
item2 item3 2
I would like to take the top n counts per group, grouping on col1. If I do that on the table above, the result is not correct, because each pair is stored in only one direction (item2 never sees its pairing with item1, and item4 never appears in col1 at all):
col1 col2 count
item1 item2 3
item3 item2 1
item2 item3 2
If I swap the columns and append the swapped rows to the same table, I get:
col1 col2 count
item1 item2 3
item3 item2 1
item1 item4 1
item2 item3 2
item2 item1 3
item2 item3 1
item4 item1 1
item3 item2 2
The group-by would then yield the correct result:
col1 col2 count
item1 item2 3
item2 item1 3
item4 item1 1
item3 item2 2
What would be the proper query for this kind of group-by? Am I correct that the columns need to be swapped and concatenated, or is there a better way to go about this? (I'm using Postgres.)
The example above shows a top 1 per group to keep things simple; in reality I need the top 10 per group.

This answers the original version of the question.
I think you want:
select v.col1, v.col2, max(t.count)
from t cross join lateral
     (values (col1, col2), (col2, col1)) v(col1, col2)
group by v.col1, v.col2;
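For the later version of the question (top n per col1 rather than a single maximum per pair), one possible approach is to symmetrize the table and rank rows within each col1 group. The following is only a sketch under stated assumptions: it runs the SQL through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), it uses UNION ALL instead of the lateral join because SQLite has no LATERAL, and the count column is renamed to cnt purely for illustration. The table name t and the TOP_N variable are also illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'cnt' is a hypothetical name standing in for the question's 'count' column
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("item1", "item2", 3),
        ("item3", "item2", 1),
        ("item1", "item4", 1),
        ("item2", "item3", 2),
    ],
)

TOP_N = 1  # the question ultimately wants 10

query = """
WITH both_ways AS (
    -- original rows plus the same rows with the columns swapped
    SELECT col1, col2, cnt FROM t
    UNION ALL
    SELECT col2, col1, cnt FROM t
),
ranked AS (
    -- rank rows inside each col1 group, largest count first
    SELECT col1, col2, cnt,
           ROW_NUMBER() OVER (PARTITION BY col1
                              ORDER BY cnt DESC, col2) AS rn
    FROM both_ways
)
SELECT col1, col2, cnt
FROM ranked
WHERE rn <= ?
ORDER BY col1, rn
"""

rows = conn.execute(query, (TOP_N,)).fetchall()
for row in rows:
    print(row)
```

With TOP_N = 1 this returns the same four rows as the "correct result" table in the question. The same CTE with ROW_NUMBER() should carry over to Postgres; ties on cnt are broken here by col2, which is an arbitrary choice.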

Related

How to divide by a value in columnA that correlates to another value in ColumnB

I am creating a SQL report to show the number of vials on hand, sales orders, POs, etc.
My system stores everything in base units (mL), and I need to divide by the unit size of the DefaultPurchasingUnit, which is 11. How do I do that when all of this comes from one table?
Item UnitSize Unit DefaultPurchasingUnit
====== ======== ============== =====================
Item1 1 mL vile
Item1 11 vile vile
Item1 693 box vile

You can use join or window functions:
select t.*,
       t.unitsize / tdef.unitsize
from t left join
     t tdef
     on t.item = tdef.item and
        t.DefaultPurchasingUnit = tdef.unit;

You can use a left join on the same table:
SELECT A.Item,
       A.UnitSize AS SizeInML,
       A.UnitSize / B.UnitSize AS DefaultUnitSize,
       B.Unit,
       B.DefaultPurchasingUnit
FROM itemTable A
LEFT JOIN itemTable B
  ON A.Item = B.Item AND A.DefaultPurchasingUnit = B.Unit
I ran this and got the result below. The default purchasing unit is vile for Item1 and box for Item2; both items use the unit sizes from your question (1, 11, and 693 for mL, vile, and box respectively):
Item SizeInML DefaultUnitSize Unit DefaultPurchasingUnit
====== ======== =============== ======= =====================
Item1 1 0.0909 vile vile
Item1 11 1 vile vile
Item1 693 63 vile vile
Item2 1 0.0014 box box
Item2 11 0.0159 box box
Item2 693 1 box box
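One detail worth checking with either self-join: if UnitSize is stored as an integer, many engines (PostgreSQL, SQL Server, and SQLite among them) truncate integer division, so 1 / 11 comes back as 0 rather than 0.0909. The sketch below reproduces the join on the question's three rows using Python's built-in sqlite3 module and multiplies by 1.0 to force a fractional result. The INTEGER column type and the InPurchasingUnits alias are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: UnitSize is an integer column
conn.execute(
    "CREATE TABLE itemTable ("
    "Item TEXT, UnitSize INTEGER, Unit TEXT, DefaultPurchasingUnit TEXT)"
)
conn.executemany(
    "INSERT INTO itemTable VALUES (?, ?, ?, ?)",
    [
        ("Item1", 1, "mL", "vile"),
        ("Item1", 11, "vile", "vile"),
        ("Item1", 693, "box", "vile"),
    ],
)

query = """
SELECT A.Item, A.Unit, A.UnitSize,
       -- 1.0 * ... avoids integer division
       1.0 * A.UnitSize / B.UnitSize AS InPurchasingUnits
FROM itemTable A
LEFT JOIN itemTable B
  ON A.Item = B.Item AND A.DefaultPurchasingUnit = B.Unit
ORDER BY A.UnitSize
"""

ratios = conn.execute(query).fetchall()
for row in ratios:
    print(row)
```

The box row comes out as 63 purchasing units and the vile row as 1, matching the result table above.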

How to get total record count for records that match 3 different criteria in single query

I have a table similar to this
Item1 Item2
yes yes
yes no
yes yes
yes yes
etc., etc.
I need the count of records that have both Item1 and Item2, as well as the counts of records that have only Item1 or only Item2, without counting any record twice in the final query. Any suggestions would be greatly appreciated, as always.

Perhaps you just want group by:
select item1, item2, count(*)
from t
group by item1, item2;
If you specifically want to combine values, you could do:
select sum(case when item1 = 'yes' and item2 = 'yes'
                then 1 else 0
           end) as two_yesses,
       sum(case when (item1 = 'yes' or item2 = 'yes') and item1 <> item2
                then 1 else 0
           end) as one_yes
from t;
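As a quick check of the conditional-aggregation query, here is a minimal sketch that runs it against the four sample rows from the question using Python's built-in sqlite3 module. The table name t follows the answer; the "etc." rows from the question are not included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (item1 TEXT, item2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("yes", "yes"), ("yes", "no"), ("yes", "yes"), ("yes", "yes")],
)

query = """
SELECT SUM(CASE WHEN item1 = 'yes' AND item2 = 'yes'
                THEN 1 ELSE 0 END) AS two_yesses,
       SUM(CASE WHEN (item1 = 'yes' OR item2 = 'yes') AND item1 <> item2
                THEN 1 ELSE 0 END) AS one_yes
FROM t
"""

# Each row falls into at most one bucket, so nothing is counted twice
two_yesses, one_yes = conn.execute(query).fetchone()
print(two_yesses, one_yes)
```

For these four rows, three have both values set to yes and one has exactly one yes.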

How do I use SQL scripts in R

I need to write a SQL query
Here are my tables
x <- read.csv("C:/Users/Admin/Downloads/Set 1-1.csv",sep=",",dec=".")
y <- read.csv("C:/Users/Admin/Downloads/Set 1-2 - Copy.csv",sep=",",dec=".")
y$score <- 1
I tried joining it
library("sqldf")
select clientid,emailmessageid,null cnttrn,idatediff,null score from x
union all select clientid,emailmessageid,cnttrn,idatediff,score from y
But I get the following errors:
select clientid,emailmessageid,null cnttrn,idatediff,null score from x
Error: unexpected symbol in "select clientid"
union all select clientid,emailmessageid,cnttrn,idatediff,score from y
Error: unexpected symbol in "union all"
Please help to correct it. Thank you.
dput(x)
ClientID EmailMessageId MinDate MaxDate IdSlip WwsCreatedDate ProductArticle ProductGroupName MainProductGroupName CategoryGroupName QtytItems SumAmount iDateDiff
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 580.0 -342
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 3190.0 -342
dput(y)
ClientID EmailMessageId CntTrn iDateDiff score
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604 9EDCC1391C208BA0 1 4 1
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254 EF72D53990BC4805 1 5 1
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB 9EDCC1391C208BA0 1 3 1

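This runs without errors for me. The query has to be passed to sqldf() as a string; typing the SQL directly at the R prompt is what produces the "unexpected symbol" errors. Apart from that, the only difference is that the query is formatted. Is the result correct?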
library(sqldf)
y <- read.table(text = "ClientID EmailMessageId CntTrn iDateDiff score
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604 9EDCC1391C208BA0 1 4 1
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254 EF72D53990BC4805 1 5 1
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB 9EDCC1391C208BA0 1 3 1", header = TRUE)
x <- read.table(header = TRUE, text = "ClientID EmailMessageId MinDate MaxDate IdSlip WwsCreatedDate ProductArticle ProductGroupName MainProductGroupName CategoryGroupName QtytItems SumAmount iDateDiff
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 580.0 -342
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 3190.0 -342")
sqldf("
  SELECT
    ClientId,
    EmailMessageId,
    null CntTrn,
    iDateDiff,
    null Score
  FROM x
  UNION ALL
  SELECT
    ClientId,
    EmailMessageId,
    CntTrn,
    iDateDiff,
    Score
  FROM y")

SSAS DAX Not Ordering Correctly

Can anyone explain why this statement is not ordering correctly, please?
Sample Workbook:- http://1drv.ms/1TRizj8
Basic query:-
EVALUATE
SUMMARIZE(
Data
,'data'[item]
,"TotalAmount", Sum(Data[Amount])
)
Result:-
Item TotalAmount
Item1 3.95128609469091
Item2 4.24529815278904
Item3 4.19327473518058
Item4 4.11105035459714
Item5 4.41249125008144
Item6 4.17408171753715
Altered Query:-
EVALUATE
SUMMARIZE(
Data
,'data'[item]
,"TotalAmount", Sum(Data[Amount])
)
order by "TotalAmount"
Actual Result:-
Item TotalAmount
Item1 3.95128609469091
Item2 4.24529815278904
Item3 4.19327473518058
Item4 4.11105035459714
Item5 4.41249125008144
Item6 4.17408171753715
Expected:-
Item TotalAmount
Item1 3.951286095
Item4 4.111050355
Item6 4.174081718
Item3 4.193274735
Item2 4.245298153
Item5 4.41249125
Hopefully I'm missing something really obvious here. Ultimately I just want to get a TOPN() based on the biggest sellers in my real data, but whenever I try to order by the total the results come back unsorted.

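Worked it out with fresh eyes this morning: TotalAmount needs square brackets in the ORDER BY clause, not double quotes. In double quotes, "TotalAmount" is a string literal rather than a column reference, so every row sorts on the same constant and the order does not change.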
Query:
EVALUATE
SUMMARIZE(
Data
,'data'[item]
,"TotalAmount", Sum(Data[Amount])
)
order by [TotalAmount]
Results:
Item TotalAmount
Item1 3.95128609469091
Item4 4.11105035459714
Item6 4.17408171753715
Item3 4.19327473518058
Item2 4.24529815278904
Item5 4.41249125008144
sigh
:)

How to select entries by comparing two lists?

I need to select all entries where any item of a list appears in some column.
Here is MY_TABLE:
COLUMN1 COLUMN2
1 item1, item2, item3, item4
2 item5, item6, item7, item8
3 item9, item10, item11, item12
4 item13, item14, item15, item16
5 item17, item18, item19, item20
So I need something like:
DECLARE #MY_LIST
SET MY_LIST = "'item1', 'item15'"
SELECT * FROM MY_TABLE WHERE #MY_LIST IN COLUMN2
and it should return:
COLUMN1 COLUMN2
1 item1, item2, item3, item4
4 item13, item14, item15, item16
MY_LIST can be an array, and the data in COLUMN2 can be converted into an array as well.
So if any item appears in both MY_LIST and COLUMN2, I need that entry to be selected.
Thank you very much for any response.
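One possible approach, offered only as a sketch: if COLUMN2 really is a single string with items separated by a comma and a space, each wanted item can be matched with a delimiter-padded LIKE test, so that item1 does not also match item10 or item11. The example runs against Python's built-in sqlite3 module because the question does not say which database is in use; the my_list variable is a hypothetical stand-in for MY_LIST, and items containing % or _ would need escaping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE (COLUMN1 INTEGER, COLUMN2 TEXT)")
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?)",
    [
        (1, "item1, item2, item3, item4"),
        (2, "item5, item6, item7, item8"),
        (3, "item9, item10, item11, item12"),
        (4, "item13, item14, item15, item16"),
        (5, "item17, item18, item19, item20"),
    ],
)

my_list = ["item1", "item15"]  # hypothetical stand-in for MY_LIST

# One padded LIKE test per wanted item, OR-ed together.
# Padding COLUMN2 with ', ' on both sides lets the first and last
# items match the same ', item, ' pattern as the middle ones.
condition = " OR ".join(["', ' || COLUMN2 || ', ' LIKE ?"] * len(my_list))
params = [f"%, {item}, %" for item in my_list]

matches = conn.execute(
    f"SELECT COLUMN1, COLUMN2 FROM MY_TABLE WHERE {condition} ORDER BY COLUMN1",
    params,
).fetchall()
for row in matches:
    print(row)
```

On the sample data this selects rows 1 and 4. Storing one item per row in a separate table, or using a native array type where the database offers one, would avoid the string matching altogether.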
