PostgreSQL: Using the LEAST() function after GROUP BY to get first transactions

I am working with a Magento table like this:
+-----------+--------------+------------+--------------+----------+-------------+
| date | email | product_id | product_type | order_id | qty_ordered |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/15 | x#y.com | 18W1 | custom | 12 | 1 |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/15 | x#y.com | 18W2 | simple | 17 | 3 |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/20 | z#abc.com | 22Y34 | simple | 119 | 1 |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/20 | z#abc.com | 22Y35 | custom | 31 | 2 |
+-----------+--------------+------------+--------------+----------+-------------+
I want to create a new view by grouping by email and keeping, for each email, only the row with the smallest order_id.
So my final table after this operation should look like this:
+-----------+--------------+------------+--------------+----------+-------------+
| date | email | product_id | product_type | order_id | qty_ordered |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/15 | x#y.com | 18W1 | custom | 12 | 1 |
+-----------+--------------+------------+--------------+----------+-------------+
| 2017/2/20 | z#abc.com | 22Y35 | custom | 31 | 2 |
+-----------+--------------+------------+--------------+----------+-------------+
I'm trying to use the following query (but it's not working):
SELECT * , (SELECT DISTINCT table.email, table.order_id,
LEAST (order_id) AS first_transaction_id
FROM
table
GROUP BY
email)
FROM table;
Would really love any help with this, thank you!

I think you want distinct on:
select distinct on (email) t.*
from t
order by email, order_id;
distinct on is a Postgres extension. It returns one row for each distinct combination of the keys in parentheses, and the order by clause decides which row that is. In this case, it returns one row per email, namely the one with the smallest order_id (because of the order by). The distinct on keys must also be the leftmost keys in the order by.
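For engines that lack distinct on, a similar result can be obtained by joining each row against the minimum order_id per email. The following is only a minimal sketch of that alternative, run through Python's built-in sqlite3 module so it can be tried without a Postgres server. The table name orders is an assumption (the question never names the table), and the rows are the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "orders" is a hypothetical table name; the columns follow the question.
conn.execute(
    "CREATE TABLE orders (date TEXT, email TEXT, product_id TEXT,"
    " product_type TEXT, order_id INTEGER, qty_ordered INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2017/2/15", "x#y.com", "18W1", "custom", 12, 1),
        ("2017/2/15", "x#y.com", "18W2", "simple", 17, 3),
        ("2017/2/20", "z#abc.com", "22Y34", "simple", 119, 1),
        ("2017/2/20", "z#abc.com", "22Y35", "custom", 31, 2),
    ],
)

# Find the smallest order_id per email, then join back to get the full row.
first_orders = conn.execute(
    """
    SELECT o.*
    FROM orders AS o
    JOIN (SELECT email, MIN(order_id) AS first_order_id
          FROM orders
          GROUP BY email) AS f
      ON f.email = o.email AND f.first_order_id = o.order_id
    ORDER BY o.email
    """
).fetchall()

for row in first_orders:
    print(row)
```

Unlike distinct on, this join returns several rows for an email if the minimum order_id is not unique within that email.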

Related

MariaDB joining tables on themselves

OK, I've googled and I've tried, but mostly I've failed.
I've got a table with the following columns:
ID (just a primary key)
UserUUID
Category
Value
I can run a query that returns the rankings of all users within a specific category:
SELECT
RANK() OVER (PARTITION BY t1.cat ORDER BY value DESC) as rank,
t1.UUID, t1.cat, t1.value
FROM t1
WHERE t1.cat='Category1'
ORDER by t1.value DESC
So this outputs something like:
| 1 | sdc9c4-541 | cat1 | 16102 |
| 2 | sqdf5d-542 | cat1 | 7313 |
| 3 | sqsd5d-685 | cat1 | 7116 |
| 4 | s45sdf-213 | cat1 | 4158 |
.....
This works, but now I'm trying to get the reverse view.
I want a query that returns the rankings of one specific user across all categories.
The desired output should look something like:
| 1 | sdc9c4-541 | cat1 | 16102 |
| 37 | sdc9c4-541 | cat2 | 25 |
| 15 | sdc9c4-541 | cat3 | 2345 |
| 2 | sdc9c4-541 | cat4 | 912 |
This shows the rank, user, category and value, where the rank is the user's position in that category compared with all other users.
I've already experimented with subqueries, WITH clauses, variables and joins, but I can't get this result to work.
Can anybody give me some pointers on which direction to look in to make this work?
Thanks in advance

Creating customer_id based on matchcodes in Oracle (SQL)

I have an Oracle database containing customer purchases (one record is one purchase). Customers provided their personal data again at every purchase, so there can be differences due to typos, address changes, etc. Now I have to identify which purchases belong to the same customer.
To do that, I created 3 different match codes based on simple rules. My table now looks something like this:
+-------------+-------------+-------------+-------------+-------------+
| PURCACHE_ID | MATCHCODE_1 | MATCHCODE_2 | MATCHCODE_3 | CUSTOMER_ID |
+-------------+-------------+-------------+-------------+-------------+
| 1 | 1 | b | x | |
| 2 | 1 | a | y | |
| 3 | 2 | c | x | |
| 4 | 3 | a | z | |
| ... | ... | ... | ... | ... |
+-------------+-------------+-------------+-------------+-------------+
What I want to do is assign a customer_id to every purchase. The same customer_id should be assigned to all purchases that share the value of any matchcode.
So, for example, purchase 1 and purchase 2 would receive the same customer_id because Matchcode_1 is the same. Purchase 2 and purchase 4 also belong to the same customer because Matchcode_2 is the same. As a result, even purchase 1 and purchase 4 would receive the same customer_id, although none of their matchcodes are equal.
Customer_id can be a simple number starting from 1.
What SQL code would generate the Customer_Id?
A naive solution:
-- Just number them
UPDATE purchases SET customer_id = rownum;
-- Group all customers with given matchcode_1 into one
MERGE INTO purchases p
USING (SELECT matchcode_1, min(customer_id) customer_id
FROM purchases
GROUP BY matchcode_1) m
ON (p.matchcode_1 = m.matchcode_1)
WHEN MATCHED THEN
UPDATE SET p.customer_id = m.customer_id;
-- Repeat the above merge for matchcode_2, matchcode_3
-- then matchcode_1 again and so on
-- until none of the matchcodes make any updates
You could write something nicer with PL/SQL probably...
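The repeated merges amount to finding connected groups of purchases, where two purchases are linked if they share a value in the same matchcode column. As an illustration only, and outside the database, that grouping can be sketched with a union-find structure in Python. The function name assign_customer_ids and the tuple layout are assumptions made for this sketch; they do not come from the question.

```python
def assign_customer_ids(purchases):
    """Group purchases that share any matchcode, directly or transitively.

    purchases: list of (purchase_id, matchcode_1, matchcode_2, matchcode_3).
    Returns a dict mapping purchase_id to a customer_id starting at 1.
    """
    parent = {p[0]: p[0] for p in purchases}

    def find(x):
        # Follow parent links to the group's representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Remember the first purchase seen for each (column, value) pair and
    # link every later purchase with the same pair to it.
    first_seen = {}
    for pid, *codes in purchases:
        for col, code in enumerate(codes):
            if code is None:
                continue  # a missing matchcode links nothing
            key = (col, code)
            if key in first_seen:
                union(pid, first_seen[key])
            else:
                first_seen[key] = pid

    # Number the groups 1, 2, 3, ... in order of first appearance.
    group_numbers = {}
    result = {}
    for pid, *_ in purchases:
        root = find(pid)
        result[pid] = group_numbers.setdefault(root, len(group_numbers) + 1)
    return result


sample = [(1, 1, "b", "x"), (2, 1, "a", "y"), (3, 2, "c", "x"), (4, 3, "a", "z")]
customer_ids = assign_customer_ids(sample)
print(customer_ids)
```

With the four sample rows from the question, every purchase lands in the same group, because purchase 3 shares Matchcode_3 with purchase 1. The computed ids would still have to be written back to the table, for example with one UPDATE per purchase.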

Filter by value in last row of LEFT OUTER JOIN table

I have a Clients table in PostgreSQL (version 9.1.11), and I would like to write a query to filter that table. The query should return only clients which meet one of the following conditions:
--The client's last order (based on orders.created_at) has a fulfill_by_date in the past.
OR
--The client has no orders at all
I've looked for around 2 months, on and off, for a solution.
I've looked at custom last aggregate functions in Postgres, but could not get them to work, and feel there must be a built-in way to do this.
I've also looked at Postgres last_value window functions, but most of the examples are of a single table, not of a query joining multiple tables.
Any help would be greatly appreciated! Here is a sample of what I am going for:
Clients table:
| client_id | client_name |
----------------------------
| 1 | FirstClient |
| 2 | SecondClient |
| 3 | ThirdClient |
Orders table:
| order_id | client_id | fulfill_by_date | created_at |
-------------------------------------------------------
| 1 | 1 | 3000-01-01 | 2013-01-01 |
| 2 | 1 | 1999-01-01 | 2013-01-02 |
| 3 | 2 | 1999-01-01 | 2013-01-01 |
| 4 | 2 | 3000-01-01 | 2013-01-02 |
Desired query result:
| client_id | client_name |
----------------------------
| 1 | FirstClient |
| 3 | ThirdClient |
Try it this way
SELECT c.client_id, c.client_name
FROM clients c LEFT JOIN
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY created_at DESC) rnum
FROM orders
) o
ON c.client_id = o.client_id
AND o.rnum = 1
WHERE o.fulfill_by_date < CURRENT_DATE
OR o.order_id IS NULL
Output:
| CLIENT_ID | CLIENT_NAME |
|-----------|-------------|
| 1 | FirstClient |
| 3 | ThirdClient |
Here is SQLFiddle demo

Create a pivot table from two tables based on dates

I have two MS Access tables sharing a one to many relationship. Their structures are like the following:
tbl_Persons
+----------+------------+-----------+
| PersonID | PersonName | OtherData |
+----------+------------+-----------+
| 1 | PersonA | etc. |
| 2 | PersonB | |
| 3 | PersonC | |
tbl_Visits
+----------+------------+------------+-----------------------
| VisitID | PersonID | VisitDate | dozens of other fields
+----------+------------+------------+-----------
| 1 | 1 | 09/01/13 |
| 2 | 1 | 09/02/13 |
| 3 | 2 | 09/03/13 |
| 4 | 2 | 09/04/13 | etc...
I wish to create a new table based on the VisitDate field, the column headings of which are Visit-n where n is 1 to the number of visits, Visit-n-Data1, Visit-n-Data2, Visit-n-Data3 etc.
MergedTable
+----------+----------+---------------+-----------------+----------+----------------+
| PersonID | Visit1 | Visit1Data1 | Visit1Data2... | Visit2 | Visit2Data1... |
+----------+----------+---------------+-----------
| 1 | 09/01/13 | | | 09/02/13 |
| 2 | 09/03/13 | | | 09/04/13 |
| 3 | etc. | |
I am really not sure how to do this, whether with an SQL query or by using DAO and looping through records and columns. It is essential that there is only 1 PersonID per row and that all of that person's data appears chronologically in columns.
Start off by ranking the visits with something like
SELECT PersonID, VisitID,
(SELECT COUNT(VisitID) FROM tbl_Visits AS C
WHERE C.PersonID = tbl_Visits.PersonID
AND C.VisitDate < tbl_Visits.VisitDate) AS RankNumber
FROM tbl_Visits
Use this query as a base for the 'pivot'.
If a person can have several visits on the same day, the WHERE clause needs to be a bit more sophisticated (for example, by breaking ties on VisitID). But I hope you get the basic concept.
Pivoting can be done with multiple LEFT JOINs, one per rank number.
I have not tested this solution, so I cannot vouch for its performance. This is easier to accomplish in SQL Server than in MS Access.
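To make the rank-then-pivot idea concrete, here is a small sketch in plain Python rather than Access SQL. It orders each person's visits by date (using VisitID as a tie-breaker, which is an assumption) and lays the dates out as Visit1, Visit2, and so on. The function name pivot_visits is hypothetical, and only the VisitDate column is pivoted; the other per-visit fields would follow the same pattern.

```python
from collections import defaultdict
from datetime import datetime

# (VisitID, PersonID, VisitDate) rows taken from the question's tbl_Visits.
visits = [
    (1, 1, "09/01/13"),
    (2, 1, "09/02/13"),
    (3, 2, "09/03/13"),
    (4, 2, "09/04/13"),
]


def pivot_visits(visits):
    """Return {PersonID: [Visit1 date, Visit2 date, ...]} in date order."""
    by_person = defaultdict(list)
    for visit_id, person_id, visit_date in visits:
        # Assumes MM/DD/YY dates, as the sample data appears to use.
        parsed = datetime.strptime(visit_date, "%m/%d/%y")
        by_person[person_id].append((parsed, visit_id, visit_date))

    # Every row gets as many visit columns as the busiest person has visits.
    width = max((len(v) for v in by_person.values()), default=0)
    rows = {}
    for person_id, person_visits in by_person.items():
        person_visits.sort()  # by date, then VisitID as tie-breaker
        dates = [original for _, _, original in person_visits]
        rows[person_id] = dates + [None] * (width - len(dates))
    return rows


merged = pivot_visits(visits)
print(merged)
```

People without any visits (PersonC in the sample) do not appear in this result; in SQL they would come from a LEFT JOIN starting at tbl_Persons.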

LEFT JOINing the max/top

I have two tables from which I'm trying to run a query to return the maximum (or top) transaction for each person. I should note that I cannot change the table structure. Rather, I can only pull data.
People
+-----------+
| id | name |
+-----------+
| 42 | Bob |
| 65 | Ted |
| 99 | Stu |
+-----------+
Transactions (there is no primary key)
+---------------------------------+
| person | amount | date |
+---------------------------------+
| 42 | 3 | 9/14/2030 |
| 42 | 4 | 7/02/2015 |
| 42 | *NULL* | 2/04/2020 |
| 65 | 7 | 1/03/2010 |
| 65 | 7 | 5/20/2020 |
+---------------------------------+
Ultimately, for each person I want to return the transaction with the highest amount. If the highest amount is tied, I'd like to look at the date and return the most recent one.
So, I'd like my query to return:
+----------------------------------------+
| person_id | name | amount | date |
+----------------------------------------+
| 42 | Bob | 4 | 7/02/2015 | (<- highest amount)
| 65 | Ted | 7 | 5/20/2020 | (<- most recent date)
| 99 | Stu | *NULL* | *NULL* | (<- no records in Transactions table)
+----------------------------------------+
SELECT People.id, name, amount, date
FROM People
LEFT JOIN (
SELECT TOP 1 person_id
FROM Transactions
WHERE person_id = People.id
ORDER BY amount DESC, date ASC
)
ON People.id = person_id
I can't figure out what I am doing wrong, but I know it's wrong. Any help would be much appreciated.
You are almost there, but since each person can have several rows in the Transactions table, you need to keep only one of them per person by using the Row_number() function.
Try this:
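-- Number each person's transactions: highest amount first, then latest date
WITH cte AS (
    SELECT person, amount, date,
           ROW_NUMBER() OVER (PARTITION BY person
                              ORDER BY amount DESC, date DESC) AS row_num
    FROM Transactions
)
SELECT a.id AS person_id, a.name, b.amount, b.date
FROM People AS a
LEFT JOIN cte AS b
    ON a.id = b.person
    AND b.row_num = 1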
The result is in Sql Fiddle
Edit: Row_number() from MSDN:
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
PARTITION BY is used to group the result set, and the OVER clause determines the partitioning and ordering of the rowset before the associated window function is applied.
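As a rough way to try the Row_number() approach without SQL Server, the sketch below runs an equivalent query through Python's built-in sqlite3 module. It assumes a SQLite build with window function support (3.25 or later), uses the People and Transactions tables from the question, and stores the dates in ISO format so that they sort correctly as text; that date format is an assumption of this sketch, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (id INTEGER, name TEXT);
    CREATE TABLE Transactions (person INTEGER, amount INTEGER, date TEXT);
    INSERT INTO People VALUES (42, 'Bob'), (65, 'Ted'), (99, 'Stu');
    INSERT INTO Transactions VALUES
        (42, 3, '2030-09-14'), (42, 4, '2015-07-02'), (42, NULL, '2020-02-04'),
        (65, 7, '2010-01-03'), (65, 7, '2020-05-20');
    """
)

# Number each person's transactions (highest amount first, latest date
# breaking ties) and join only the row numbered 1 to People.
rows = conn.execute(
    """
    WITH ranked AS (
        SELECT person, amount, date,
               ROW_NUMBER() OVER (PARTITION BY person
                                  ORDER BY amount DESC, date DESC) AS row_num
        FROM Transactions
    )
    SELECT p.id, p.name, r.amount, r.date
    FROM People AS p
    LEFT JOIN ranked AS r ON r.person = p.id AND r.row_num = 1
    ORDER BY p.id
    """
).fetchall()

for row in rows:
    print(row)
```

SQLite, like SQL Server, sorts NULL below every other value, so a NULL amount comes last under ORDER BY amount DESC and never wins over a real amount. The LEFT JOIN keeps Stu, who has no transactions, with NULL in the amount and date columns.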