Need help with SQL query - sql

The following is a simplified version of a database table I'm querying (let's call it Payments):
date | userid | payment
20/1/10 | 1 | 10
21/1/10 | 1 | 15
17/1/10 | 2 | 7
18/1/10 | 2 | 9
It records payments made by users on certain dates. I need to find out the details of the first payment made by each user like so:
20/1/10 | 1 | 10
17/1/10 | 2 | 7
Stored procedures are out of the question. Is there any way to do this using SQL alone or should I just add a first payment flag to the table?

SELECT MIN([Date]), userid, payment
FROM Payments
GROUP BY Userid, payment

SELECT MIN([Date]), UserID FROM Payments GROUP BY UserID

Try this:
SELECT * FROM Payments
INNER JOIN (SELECT Min([Date]) AS MinDate, UserID
FROM Payments GROUP BY UserID) AS M
ON M.MinDate = Payments.[Date] AND M.UserID = Payments.UserID
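The join above can be tried end to end with a small sketch. It assumes an in-memory SQLite database standing in for the real one, ISO-8601 date strings instead of the d/m/yy values shown in the question, and a hypothetical helper name first_payments; none of these come from the original post.

```python
import sqlite3


def first_payments(rows):
    """Return the earliest payment row for each user.

    rows: iterable of (date, userid, payment); dates are ISO-8601 strings
    (an assumption, so that MIN() on text sorts chronologically).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Payments (date TEXT, userid INTEGER, payment INTEGER)"
    )
    conn.executemany("INSERT INTO Payments VALUES (?, ?, ?)", rows)
    # The derived table M holds each user's earliest date; joining it back
    # to Payments recovers the other columns of that same row.
    result = conn.execute(
        """
        SELECT p.date, p.userid, p.payment
        FROM Payments AS p
        INNER JOIN (SELECT MIN(date) AS MinDate, userid
                    FROM Payments
                    GROUP BY userid) AS M
          ON M.MinDate = p.date AND M.userid = p.userid
        ORDER BY p.userid
        """
    ).fetchall()
    conn.close()
    return result


sample = [
    ("2010-01-20", 1, 10),
    ("2010-01-21", 1, 15),
    ("2010-01-17", 2, 7),
    ("2010-01-18", 2, 9),
]
print(first_payments(sample))
# prints [('2010-01-20', 1, 10), ('2010-01-17', 2, 7)]
```

If a user has two payments on the same earliest date, this approach returns both rows.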

Show column values as comma-separated in Grafana

I have a table with 2 columns
organization_id | user_name
1 | abc
1 | xyz
2 | bhi
2 | ipq
2 | sko
3 | ask
...
Each organization can have any number of users: 1, 100, 2000, and so on.
I want to show them in a Grafana table as follows:
organization_id | user_name
1 | abc, xyz
2 | bhi, ipq, sko
3 | ask
Since there could be many users, I want to show any 10 users belonging to the same organization.
The database is TimescaleDB, and the table is also a time-series table recording when each user was registered.
If I understand correctly that you want 10 users per organisation, you can use the query below.
I have added a GROUP BY in the CTE to avoid returning duplicate user_names.
In the test schema there are duplicate values of 'pqr' for organisation 2, but this user name is returned only once, even though organisation 2 has fewer than 10 user_names.
With topTen as
(Select
    Organisation_id,
    User_name,
    Rank() over (
        partition by organisation_id
        order by user_name) rn
 From table_name
 group by
    Organisation_id,
    user_name)
Select
    Organisation_id,
    String_agg(user_name,',') users
From topTen
Where rn <= 10
group by Organisation_id;
organisation_id | users
--------------: | :--------------------------------------
1 | abc,abk,def,ghi,jkl,mno,pqr,rst,ruk,stu
2 | abk,pqr,rst,ruk,stu,vwx
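Another alternative which may be useful: if you remove the WHERE clause and put the following after From topTen, you will get all the distinct user_names, 10 per row. Because rn starts at 1, subtract 1 before dividing so that each bucket holds exactly 10 names.
group by Organisation_id,(rn-1)/10
order by Organisation_id,(rn-1)/10;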
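For readers without a TimescaleDB instance, here is a rough sketch of the same "at most N distinct names per organisation" idea. It rests on several assumptions that are not in the answer: SQLite stands in for the real database, the table name registrations and the helper users_per_org are hypothetical, a correlated COUNT replaces Rank(), and the comma joining is done in Python rather than with String_agg.

```python
import sqlite3


def users_per_org(rows, limit=10):
    """Map each organization_id to a comma-separated string of at most
    `limit` distinct user names (the alphabetically first ones)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registrations (organization_id INTEGER, user_name TEXT)"
    )
    conn.executemany("INSERT INTO registrations VALUES (?, ?)", rows)
    # A name is kept when fewer than `limit` distinct names in the same
    # organization sort before it, i.e. it is among the first `limit`.
    kept = conn.execute(
        """
        SELECT d.organization_id, d.user_name
        FROM (SELECT DISTINCT organization_id, user_name
              FROM registrations) AS d
        WHERE (SELECT COUNT(DISTINCT e.user_name)
               FROM registrations AS e
               WHERE e.organization_id = d.organization_id
                 AND e.user_name < d.user_name) < ?
        ORDER BY d.organization_id, d.user_name
        """,
        (limit,),
    ).fetchall()
    conn.close()
    grouped = {}
    for org, name in kept:
        grouped.setdefault(org, []).append(name)
    return {org: ", ".join(names) for org, names in grouped.items()}


rows = [
    (1, "abc"), (1, "xyz"),
    (2, "bhi"), (2, "ipq"), (2, "sko"), (2, "ipq"),  # 'ipq' is duplicated
    (3, "ask"),
]
print(users_per_org(rows))
# prints {1: 'abc, xyz', 2: 'bhi, ipq, sko', 3: 'ask'}
```

The correlated count is quadratic per organisation, so on a large table the window-function version in the answer is likely the better choice.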

MS Access 2016 - Pull client name from separate table in complex query

I have three tables for vulnerability scanning jobs: customers, authorization forms, and scans. Relationships are one to many from left to right. I previously had scans directly related to clients, but implemented the forms table to add the ability to prevent scanning without authorization. I have the below query which pulls the dates of the most recent and next upcoming scans (huge thanks to @donPablo), but since I made the change in tables I'm no longer pulling the correct data from the customers table. I'm not exactly sure how to fix it.
SELECT u.Customer_Company, z.*
FROM (Select
NZ(a.Scan_Data.Customer_ID, b.Scan_Data.Customer_ID) as Customer,
aPast as Past,
aFuture as Future,
DATEDIFF("d", aPast, aFuture) as Difference
FROM
(Select Scan_Data.Customer_ID, Max(Scan_Date) as aPast from Scan_Data where Scan_Date <= DATE() Group By Scan_Data.Customer_ID) a
LEFT JOIN
(Select Scan_Data.Customer_ID, Min(Scan_Date) as aFuture from Scan_Data where Scan_Date > DATE() Group By Scan_Data.Customer_ID) b
ON a.Scan_Data.Customer_ID = B.Scan_Data.Customer_ID
UNION
Select
NZ(a.Scan_Data.Customer_ID, b.Scan_Data.Customer_ID) as Customer,
aPast as Past,
aFuture as Future,
DATEDIFF("d", aPast, aFuture) as Difference
FROM
(Select Scan_Data.Customer_ID, Max(Scan_Date) as aPast from Scan_Data where Scan_Date <= DATE() Group By Scan_Data.Customer_ID) a
RIGHT JOIN
(Select Scan_Data.Customer_ID, Min(Scan_Date) as aFuture from Scan_Data where Scan_Date > DATE() Group By Scan_Data.Customer_ID) b
ON a.Scan_Data.Customer_ID = B.Scan_Data.Customer_ID
) AS z LEFT JOIN Customer_Data AS u ON cint(z.Customer) = cint(u.Customer_ID);
In this query the Scan_Data.Customer_ID winds up being the FormID and it then pulls the customer's name based on the FormID. I fixed it in my other queries by doing a double inner join to pull the actual CustomerID based on the FormID, but I can't get that to work here because of the existing joins. Form_Data.Customer_ID is the way it's identified in the Form table. All IDs in their primary tables are autonumber generated PKs.
Customer_Data table:
.Customer_ID | .Customer_Name | etc.
1 | Microsoft |
2 | Reddit |
Form_Data table:
.Form_ID | .Signature_Date | .Expiration_Date | .Customer_ID
1 | 01-Jan-19 | 01-Jan-20 | 2/Reddit
2 | 15-May-18 | 15-May-21 | 1/Microsoft
Scan_Data table:
.Scan_ID | .Scan_Title | .Scan_Date | .Customer_ID
1 | First MS 19052018 | 19-May-18 | 1/2/Reddit
2 | First R 05012019 | 05-Jan-19 | 2/1/Microsoft
The above Scan_Data shows the problem I'm having. The numbers in the Scan_Data.Customer_ID field are the PKs from the other two tables. The .Customer_ID field is pulling the customer ID based upon the form ID and not the actual customer ID. It should show like this:
.Scan_ID | .Scan_Title | .Scan_Date | .Customer_ID
1 | First MS 19052018 | 19-May-18 | 2/1/Microsoft
2 | First R 05012019 | 05-Jan-19 | 1/2/Reddit

More efficient way to query shortest string value associated with each value in another column in Hive QL

I have a table in Hive containing store names, order IDs, and User IDs (as well as some other columns including item ID). There is a row in the table for every item purchased (so there can be more than one row per order if the order contains multiple items). Order IDs are unique within a store, but not across stores. A single order can have more than one user ID associated with it.
I'm trying to write a query that will return a list of all stores and order IDs and the shortest user ID associated with each order.
So, for example, if the data looks like this:
STORE | ORDERID | USERID | ITEMID
------+---------+--------+-------
a     | 1       | bill   | abc
a     | 1       | susan  | def
a     | 2       | jane   | abc
b     | 1       | scott  | ghi
b     | 1       | tony   | jkl
Then the output would look like this:
STORE | ORDERID | USERID
------+---------+-------
a | 1 | bill
a | 2 | jane
b | 1 | tony
I've written a query that will do this, but I feel like there must be a more efficient way to go about it. Does anybody know a better way to produce these results?
This is what I have so far:
select
users.store, users.orderid, users.userid
from
(select
store, orderid, userid, length(userid) as len
from
sales) users
join
(select distinct
store, orderid,
min(length(userid)) over (partition by store, orderid) as len
from
sales) len on users.store = len.store
and users.orderid = len.orderid
and users.len = len.len
Try the following, which will probably work for you. It reaches your goal with a single SELECT and no join:
select distinct
store, orderid,
first_value(userid) over(partition by store, orderid order by length(userid) asc) f_val
from
sales;
The result will be:
store orderid f_val
a 1 bill
a 2 jane
b 1 tony
Probably rank() is the best way:
select s.*
from (select s.*, rank() over (partition by store, orderid order by length(userid)) as seqnum
      from sales s
     ) s
where seqnum = 1;
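To check the expected output against the sample data without a Hive cluster, the following sketch may help. It is not Hive QL: it assumes SQLite as a stand-in, uses a plain GROUP BY in place of the window function (a simplification of the asker's own join), and the helper name shortest_user_per_order is made up for illustration.

```python
import sqlite3


def shortest_user_per_order(rows):
    """Return (store, orderid, userid) with the shortest userid per order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (store TEXT, orderid INTEGER, userid TEXT, itemid TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    # m holds the minimum userid length per (store, orderid); the join keeps
    # only rows whose userid has that length. DISTINCT collapses the
    # duplicates produced by orders with several item rows per user.
    result = conn.execute(
        """
        SELECT DISTINCT s.store, s.orderid, s.userid
        FROM sales AS s
        JOIN (SELECT store, orderid, MIN(LENGTH(userid)) AS min_len
              FROM sales
              GROUP BY store, orderid) AS m
          ON s.store = m.store
         AND s.orderid = m.orderid
         AND LENGTH(s.userid) = m.min_len
        ORDER BY s.store, s.orderid, s.userid
        """
    ).fetchall()
    conn.close()
    return result


sales = [
    ("a", 1, "bill", "abc"),
    ("a", 1, "susan", "def"),
    ("a", 2, "jane", "abc"),
    ("b", 1, "scott", "ghi"),
    ("b", 1, "tony", "jkl"),
]
print(shortest_user_per_order(sales))
# prints [('a', 1, 'bill'), ('a', 2, 'jane'), ('b', 1, 'tony')]
```

When two user IDs in one order share the minimum length, both are returned, which is also how the rank() query behaves.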

How to find every customer's favourite category with a query

I have a table in MS Access which looks basically like this:
Table Name : Customer_Categories
+----------------------+------------+-------+
| Email                | CategoryID | Count |
+----------------------+------------+-------+
| jim@example.com      | 10         | 4     |
+----------------------+------------+-------+
| jim@example.com      | 2          | 1     |
+----------------------+------------+-------+
| simon@example.com    | 5          | 2     |
+----------------------+------------+-------+
| steven@example.com   | 10         | 16    |
+----------------------+------------+-------+
| steven@example.com   | 5          | 3     |
+----------------------+------------+-------+
This table has ≈ 350,000 records. Its characteristics are:
Duplicate values for Email, CategoryID and Count
Count refers to the number of times this customer has ordered from this category
What I want
I want to create a table that consists of a unique email address along with the CategoryID this customer has purchased from the most.
So the above example would be:
+----------------------+------------+
| Email                | CategoryID |
+----------------------+------------+
| jim@example.com      | 10         |
+----------------------+------------+
| simon@example.com    | 5          |
+----------------------+------------+
| steven@example.com   | 10         |
+----------------------+------------+
What I have tried
I have written a query that achieves what I want:
SELECT main.Email, (SELECT TOP 1 CategoryID
FROM Customer_Categories
WHERE main.Email = Email
GROUP BY CategoryID
ORDER BY MAX(Count) DESC, CategoryID ASC) AS Category
FROM Customer_Categories AS main
GROUP BY main.Email;
This works a treat and does exactly what I want, returning results in around 8 seconds. However, I need this data in a new table because I then want to update another table with the CategoryID. When I add INTO Customer_Favourite_Categories after the sub-query, to write the data to a new table rather than just return the result set, the query never finishes. I've left it running for about 45 minutes and it does nothing.
Is there any way around this?
If select into doesn't work, use insert into:
create table Customer_Favorite_Categories (
email <email type>,
FavoriteCategory <CategoryId type>
);
insert into Customer_Favorite_Categories
SELECT main.Email, (SELECT TOP 1 CategoryID
FROM Customer_Categories
WHERE main.Email = Email
GROUP BY CategoryID
ORDER BY MAX(Count) DESC, CategoryID ASC) AS Category
FROM Customer_Categories AS main
GROUP BY main.Email;
Try this:
SELECT Distinct(Email),Max(CategoryID )
FROM Customer_Categories group by Email
I use sub-queries for this quite frequently. Your query in "What I have tried" is close, but just a little off in syntax. Something like the following should get what you are after. Count is in square brackets since it's a reserved word in SQL, and the selected columns are qualified with m because Email exists in both sources. The spacing I use in my SQL is conventional, so edit to your liking.
SELECT m.Email,
       m.CategoryID
FROM MyTable AS m,
     (
         SELECT Email,
                MAX( [Count] ) AS mc
         FROM MyTable
         GROUP BY Email
     ) AS f
WHERE m.Email = f.Email
  AND m.[Count] = f.mc;
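The max-per-group join can be exercised on the sample rows with the sketch below. It assumes SQLite rather than MS Access (so Count is quoted with double quotes instead of square brackets), adds MIN(CategoryID) as a tie-break to mirror the ORDER BY in the question, and uses a hypothetical helper name favourite_categories.

```python
import sqlite3


def favourite_categories(rows):
    """Return (Email, CategoryID) for each customer's most-ordered category.

    Ties on Count are broken by the lowest CategoryID.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE Customer_Categories '
        '(Email TEXT, CategoryID INTEGER, "Count" INTEGER)'
    )
    conn.executemany("INSERT INTO Customer_Categories VALUES (?, ?, ?)", rows)
    # f holds each customer's highest Count; joining back keeps only the
    # category rows that reach it, and MIN() picks one when several do.
    result = conn.execute(
        """
        SELECT m.Email, MIN(m.CategoryID) AS CategoryID
        FROM Customer_Categories AS m
        JOIN (SELECT Email, MAX("Count") AS mc
              FROM Customer_Categories
              GROUP BY Email) AS f
          ON m.Email = f.Email AND m."Count" = f.mc
        GROUP BY m.Email
        ORDER BY m.Email
        """
    ).fetchall()
    conn.close()
    return result


data = [
    ("jim@example.com", 10, 4),
    ("jim@example.com", 2, 1),
    ("simon@example.com", 5, 2),
    ("steven@example.com", 10, 16),
    ("steven@example.com", 5, 3),
]
print(favourite_categories(data))
# prints [('jim@example.com', 10), ('simon@example.com', 5), ('steven@example.com', 10)]
```

Because the result has exactly one row per email, it could feed an INSERT INTO a favourites table without producing duplicates.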

SQL - Select unique rows from a group of results

I have racked my brain on this problem for quite some time. I've also reviewed other questions but was unsuccessful.
The problem I have is, I have a list of results/table that has multiple rows with columns
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 005DTHGP | 1966 | 2006-09-12 | Tracker
| 013DTHGP | 2281 | 2006-11-01 | Tracker
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 2404 | 2006-10-20 | Tracker
| 017DTNGP | 508 | 2007-11-10 | MBio
I am trying to select rows with unique REGISTRATIONS where the DATE is the max (the latest). The IDs are not proportional to the DATE, meaning the ID could be a low value yet the DATE is later than the other matching row, and vice versa. Therefore I can't use MAX() on both the DATE and ID, and grouping just doesn't seem to work.
The results I want are as follows;
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 508 | 2007-11-10 | MBio
PLEASE HELP!!!?!?!?!?!?!?
You want embedded queries (subqueries), which not all SQL dialects support. In T-SQL you'd have something like
select r.registration, r.recent, t.id, t.unittype
from (
select registration, max([date]) recent
from #tmp
group by
registration
) r
left outer join
#tmp t
on r.recent = t.[date]
and r.registration = t.registration
T-SQL:
declare @R table
(
Registration varchar(16),
ID int,
Date datetime,
UnitType varchar(16)
)
insert into @R values ('A','1','20090824','A')
insert into @R values ('A','2','20090825','B')
select R.Registration,R.ID,R.UnitType,R.Date from @R R
inner join
(select Registration,Max(Date) as Date from @R group by Registration) M
on R.Registration = M.Registration and R.Date = M.Date
This can be inefficient if you have thousands of rows in your table depending upon how the query is executed (i.e. if it is a rowscan and then a select per row).
In PostgreSQL, and assuming your data is indexed so that a sort isn't needed (or there are so few rows you don't mind a sort):
select distinct on (registration) * from whatever order by registration, "date" desc;
With the rows sorted by registration and then by descending date, the latest date for each registration comes first. DISTINCT ON keeps that first row and throws away the rows for the same registration that follow.
select registration,ID,date,unittype
from your_table
where (registration, date) IN (select registration,max(date)
from your_table
group by registration)
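The row-value IN form above can be checked against the question's sample rows with a short sketch. It assumes SQLite 3.15 or later (which accepts row values on the left of IN), a hypothetical table name units, and a hypothetical helper latest_per_registration; other engines may not support this syntax.

```python
import sqlite3


def latest_per_registration(rows):
    """Return the full row with the latest date for each registration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE units (registration TEXT, id INTEGER, date TEXT, unittype TEXT)"
    )
    conn.executemany("INSERT INTO units VALUES (?, ?, ?, ?)", rows)
    # The subquery lists (registration, latest date) pairs; the row-value
    # IN keeps only the rows whose pair appears in that list.
    result = conn.execute(
        """
        SELECT registration, id, date, unittype
        FROM units
        WHERE (registration, date) IN (SELECT registration, MAX(date)
                                       FROM units
                                       GROUP BY registration)
        ORDER BY registration
        """
    ).fetchall()
    conn.close()
    return result


units = [
    ("005DTHGP", 172, "2007-09-11", "MBio"),
    ("005DTHGP", 1966, "2006-09-12", "Tracker"),
    ("013DTHGP", 2281, "2006-11-01", "Tracker"),
    ("013DTHGP", 2712, "2008-05-30", "MBio"),
    ("017DTNGP", 2404, "2006-10-20", "Tracker"),
    ("017DTNGP", 508, "2007-11-10", "MBio"),
]
for row in latest_per_registration(units):
    print(row)
```

On the sample data this yields the three MBio rows listed as the desired result in the question.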
This should work in MySQL:
SELECT registration, id, date, unittype FROM table_name,
(SELECT registration AS temp_reg, MAX(date) as temp_date
FROM table_name GROUP BY registration) AS temp_table
WHERE registration=temp_reg and date=temp_date
The idea is to use a subquery in the FROM clause that returns one row per registration containing the correct date and registration (the fields subjected to the grouping), then match that date and registration in the WHERE clause to fetch the other fields of the same row from the table.