SQL latest/top items in category

What is a scalable way to select the latest 10 items from each category?
I have a schema like this:
item category updated
I want to select the 10 most recently updated items from each category. The only solution I can come up with is to query for the categories first and then issue some sort of union query:
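categories = sql.execute("select category from categories_table")
parts = []
for cat in categories:
    # one derived table per category so each "order by" applies before the union
    parts.append("select * from (select top 10 * from table where category = '%s' order by updated desc) as t" % cat)
result = sql.execute(" union all ".join(parts))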
I am not sure how efficient this will be for bigger databases (1 million rows).
If there is a way to do this in one go - that would be nice.
Any help appreciated.

This will not compile but you'll have the general idea:
from i in table
group i by i.category into g
select new { cat = g.Key, LastTens = g.OrderByDescending(o => o.Updated).Take(10).Select(...) }
EDIT: the question asked for SQL:
SELECT * FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY categoryId ORDER BY somedate DESC) AS PartNum,
categoryId,
[...]
FROM
category
) AS Partitioned
WHERE PartNum <= 10
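
The ROW_NUMBER() approach above can be tried end to end with a small sketch. It assumes SQLite 3.25 or later (for window functions) driven from Python's sqlite3 module; the table name items, its columns, and the helper names are hypothetical stand-ins for the schema in the question, not part of the original answer.

```python
import sqlite3


def latest_per_category(conn, n):
    """Return the n most recently updated items of every category."""
    sql = """
        SELECT item, category, updated
        FROM (
            SELECT item, category, updated,
                   ROW_NUMBER() OVER (
                       PARTITION BY category
                       ORDER BY updated DESC      -- newest row gets number 1
                   ) AS part_num
            FROM items
        ) AS partitioned
        WHERE part_num <= ?
        ORDER BY category, updated DESC
    """
    return conn.execute(sql, (n,)).fetchall()


def demo_connection():
    """Build an in-memory table: categories 'a' and 'b', five items each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item TEXT, category TEXT, updated INTEGER)")
    rows = [(f"{cat}{i}", cat, i) for cat in ("a", "b") for i in range(1, 6)]
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    # Assumption: an index on (category, updated) may help on large tables.
    conn.execute("CREATE INDEX idx_items_cat_updated ON items (category, updated)")
    return conn


if __name__ == "__main__":
    for row in latest_per_category(demo_connection(), 2):
        print(row)
```

The whole result comes back from one statement, so there is no need to query the category list first and build a union per category.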

Related

How to deselect duplicate entries in a query?

I've got a query like this:
SELECT *
FROM RecipeTable, RecipeIngredientTable, SyncRecipeIngredientTable
WHERE RecipeTable.recipe_id = SyncRecipeIngredientTable.recipe_id
AND RecipeIngredientTable.recipe_ingredient_id =
SyncRecipeIngredientTable.recipe_ingredient_id
AND RecipeIngredientTable.recipe_item_name in ("ayva", "pirinç", "su")
GROUP by RecipeTable.recipe_id
HAVING COUNT(*) >= 3;
and this query returns the result like this:
As you can see in the image, there are 3 duplicate, unnecessary entries (no, I can't delete them because of the multiple foreign keys). How can I exclude these duplicate entries from the query result? In the end I want to return 6 entries, not 9.
What you want to eliminate in the result set is not duplication of recipe_id values but recipe_name values.
You just need to partition by recipe_name using the ROW_NUMBER() analytic function:
SELECT recipe_id, author_name ...
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY recipe_name) AS rn,
sr.recipe_id, author_name ...
FROM SyncRecipeIngredientTable sr
JOIN RecipeIngredientTable ri
ON ri.recipe_ingredient_id = sr.recipe_ingredient_id
JOIN RecipeTable rt
ON rt.recipe_id = sr.recipe_id
WHERE ri.recipe_item_name in ("ayva", "pirinç", "su")
)
WHERE rn = 1
This way, only the record with rn = 1 is picked from each group (an ORDER BY clause can be added to the analytic function after the PARTITION BY clause if a specific record needs to be picked).
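
As a rough illustration of keeping one row per recipe_name, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+ assumed). The single recipes table and the sample names are hypothetical and much simpler than the three-table join in the question. An ORDER BY is included inside the window so that the surviving row is deterministic: the lowest recipe_id wins.

```python
import sqlite3


def one_row_per_name(conn):
    """Keep a single row for each recipe_name (the one with the lowest id)."""
    sql = """
        SELECT recipe_id, recipe_name
        FROM (
            SELECT recipe_id, recipe_name,
                   ROW_NUMBER() OVER (
                       PARTITION BY recipe_name
                       ORDER BY recipe_id
                   ) AS rn
            FROM recipes
        ) AS numbered
        WHERE rn = 1
        ORDER BY recipe_id
    """
    return conn.execute(sql).fetchall()


def demo_connection():
    """Hypothetical data: two names appear twice under different ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, recipe_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO recipes VALUES (?, ?)",
        [(1, "pilav"), (2, "pilav"), (3, "komposto"), (4, "komposto"), (5, "recel")],
    )
    return conn


if __name__ == "__main__":
    print(one_row_per_name(demo_connection()))
```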

MS-ACCESS / SQL - How to apply where clause in multiple conditions

SELECT Stock.*
FROM Stock
WHERE (
(
(Stock.ComputerPartNumber) In (SELECT [ComputerPartNumber] FROM [Stock] As Tmp GROUP BY [ComputerPartNumber] HAVING Count(*)=2)
)
AND
(
(Stock.EquipmentName)="EquipmentA" Or (Stock.EquipmentName)="EquipmentB")
)
OR (
(
(Stock.ComputerPartNumber) In (SELECT [ComputerPartNumber] FROM [Stock] As Tmp GROUP BY [ComputerPartNumber] HAVING Count(*)=1)
)
AND (
(Stock.EquipmentName)="EquipmentA" Or (Stock.EquipmentName)="EquipmentB"
)
);
I am using the SQL above to achieve the following 3 items:
1. Find every ComputerPartNumber that is used by EquipmentA and/or EquipmentB only.
2. Filter out the result if the ComputerPartNumber is used by equipment other than EquipmentA and EquipmentB.
3. If the ComputerPartNumber is used by both EquipmentA and EquipmentC, filter out the result as well.
However, item 3 is not filtered out successfully. What should I do to achieve item 3?
Table and Query snapshots are attached. Thanks in advance!
What you need to do is to check if the total number of times a part is used in all pieces of Equipment is equal to the total number of times a part is used by either Equipment A or B:
SELECT S.StorageID, S.ComputerPartNumber, S.EquipmentName, S.Result
FROM Stock AS S
WHERE
(SELECT COUNT(*) FROM Stock AS S1 WHERE S1.ComputerPartNumber=S.ComputerPartNumber)
=(SELECT COUNT(*) FROM Stock AS S2 WHERE S2.ComputerPartNumber=S.ComputerPartNumber AND S2.EquipmentName IN("EquipmentA","EquipmentB"))
You can use not exists:
select s.*
from stock as s
where not exists (select 1
from stock as s2
where s2.ComputerPartNumber = s.ComputerPartNumber and
s2.EquipmentName not in ("EquipmentA", "EquipmentB")
);
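
Both answers above can be compared on a tiny data set. This is only a sketch: it runs against SQLite through Python's sqlite3 module rather than MS Access, and the part numbers P1 to P4 are made up for illustration. The Stock table and column names follow the question.

```python
import sqlite3

ALLOWED = ("EquipmentA", "EquipmentB")

# Keep a row only if no row for the same part belongs to other equipment.
NOT_EXISTS_SQL = """
    SELECT s.ComputerPartNumber, s.EquipmentName
    FROM Stock AS s
    WHERE NOT EXISTS (
        SELECT 1
        FROM Stock AS s2
        WHERE s2.ComputerPartNumber = s.ComputerPartNumber
          AND s2.EquipmentName NOT IN (?, ?)
    )
    ORDER BY s.ComputerPartNumber, s.EquipmentName
"""

# Keep a row only if all uses of the part are uses by the allowed equipment.
COUNT_SQL = """
    SELECT s.ComputerPartNumber, s.EquipmentName
    FROM Stock AS s
    WHERE (SELECT COUNT(*) FROM Stock AS s1
           WHERE s1.ComputerPartNumber = s.ComputerPartNumber)
        = (SELECT COUNT(*) FROM Stock AS s2
           WHERE s2.ComputerPartNumber = s.ComputerPartNumber
             AND s2.EquipmentName IN (?, ?))
    ORDER BY s.ComputerPartNumber, s.EquipmentName
"""


def demo_connection():
    """Hypothetical stock rows: only P1 and P4 are used by A/B exclusively."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Stock (ComputerPartNumber TEXT, EquipmentName TEXT)")
    conn.executemany(
        "INSERT INTO Stock VALUES (?, ?)",
        [
            ("P1", "EquipmentA"), ("P1", "EquipmentB"),
            ("P2", "EquipmentA"), ("P2", "EquipmentC"),
            ("P3", "EquipmentC"),
            ("P4", "EquipmentB"),
        ],
    )
    return conn


def run(conn, sql):
    return conn.execute(sql, ALLOWED).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(run(conn, NOT_EXISTS_SQL))
    print(run(conn, COUNT_SQL))
```

On this data both queries drop P2, which is the case from item 3 (used by EquipmentA and EquipmentC).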

SQL Views (id + count_table1_column1 + count_table2_column_1)

I'm running the following query to select a serial number from the Alerts table and then count how many alerts there are for that serial number, together with how many measurements there are for it. Measurements are stored in another table. (The first 2 queries are just there to show you the result for better understanding.)
SELECT InstrumentSerialNumber FROM [dbo].[CloudMeasurements]
SELECT InstrumentSerialNumber FROM [dbo].[CloudAlerts]
SELECT
DISTINCT InstrumentSerialNumber,
(SELECT COUNT(*) FROM [CloudAlerts] WHERE [CloudAlerts].InstrumentSerialNumber = InstrumentSerialNumber) AS Alerts,
(SELECT COUNT(*) FROM [CloudMeasurements] WHERE [CloudMeasurements].InstrumentSerialNumber = InstrumentSerialNumber) AS Measurements
FROM [CloudAlerts]
See picture for the result of the query.
I assume it returns the overall COUNT(*) for every row, which is wrong from my perspective. How should I write this?
Try joining the results of their groups:
SELECT
A.InstrumentSerialNumber,
A.TotalAlerts,
ISNULL(M.TotalMeasurements, 0) TotalMeasurements
FROM
(SELECT InstrumentSerialNumber, COUNT(*) TotalAlerts FROM [CloudAlerts] GROUP BY InstrumentSerialNumber) AS A
LEFT JOIN (SELECT InstrumentSerialNumber, COUNT(*) TotalMeasurements FROM [CloudMeasurements] GROUP BY InstrumentSerialNumber)
AS M ON M.InstrumentSerialNumber = A.InstrumentSerialNumber
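
The query in the question counts whole tables because, inside each subquery, the unqualified InstrumentSerialNumber resolves to the subquery's own table, so the column is compared with itself. Grouping each table first and joining the groups avoids that. Below is a minimal sketch of the grouped join run through Python's sqlite3 module; COALESCE stands in for SQL Server's ISNULL, and the serial numbers S1 to S3 are invented sample data.

```python
import sqlite3

COUNTS_SQL = """
    SELECT a.InstrumentSerialNumber,
           a.TotalAlerts,
           COALESCE(m.TotalMeasurements, 0) AS TotalMeasurements
    FROM (SELECT InstrumentSerialNumber, COUNT(*) AS TotalAlerts
          FROM CloudAlerts
          GROUP BY InstrumentSerialNumber) AS a
    LEFT JOIN (SELECT InstrumentSerialNumber, COUNT(*) AS TotalMeasurements
               FROM CloudMeasurements
               GROUP BY InstrumentSerialNumber) AS m
           ON m.InstrumentSerialNumber = a.InstrumentSerialNumber
    ORDER BY a.InstrumentSerialNumber
"""


def demo_connection():
    """Hypothetical data: S1 has 3 alerts and 2 measurements, S2 has 1 alert only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CloudAlerts (InstrumentSerialNumber TEXT)")
    conn.execute("CREATE TABLE CloudMeasurements (InstrumentSerialNumber TEXT)")
    conn.executemany(
        "INSERT INTO CloudAlerts VALUES (?)", [("S1",), ("S1",), ("S1",), ("S2",)]
    )
    conn.executemany(
        "INSERT INTO CloudMeasurements VALUES (?)",
        [("S1",), ("S1",)] + [("S3",)] * 5,
    )
    return conn


def alert_and_measurement_counts(conn):
    return conn.execute(COUNTS_SQL).fetchall()


if __name__ == "__main__":
    for row in alert_and_measurement_counts(demo_connection()):
        print(row)
```

Because the join starts from the alerts side, a serial number with measurements but no alerts (S3 here) does not appear in the result.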

Getting row number for query

I have a query which will return one row. Is there any way I can find the row index of the row I'm querying when the table is sorted?
I've tried rowid but got #582 when I was expecting row #7.
Eg:
CategoryID Name
I9GDS720K4 CatA
LPQTOR25XR CatB
EOQ215FT5_ CatC
K2OCS31WTM CatD
JV5FIYY4XC CatE
--> C_L7761O2U CatF <-- I want this row (#5)
OU3XC6T19K CatG
L9YKCYAYMG CatH
XKWMQ7HREG CatI
I've tried rowid with unexpected results:
SELECT rowid FROM Categories WHERE CategoryID = 'C_L7761O2U' ORDER BY Name
EDIT: I've also tried J Cooper's suggestion (below), but the row numbers just aren't right.
using (var cmd = conn.CreateCommand()) {
cmd.CommandText = string.Format(@"SELECT (SELECT COUNT(*) FROM Recipes AS t2 WHERE t2.RecipeID <= t1.RecipeID) AS row_Num
FROM Recipes AS t1
WHERE RecipeID = 'FB3XSAXRWD'
ORDER BY Name");
cmd.Parameters.AddWithValue("@recipeId", id);
idx = Convert.ToInt32(cmd.ExecuteScalar());
}
Here is a way to get the row number in Sqlite:
SELECT CategoryID,
Name,
(SELECT COUNT(*)
FROM mytable AS t2
WHERE t2.Name <= t1.Name) AS row_Num
FROM mytable AS t1
ORDER BY Name, CategoryID;
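
To get the position of a single row, the same correlated count can be filtered on the key. The sketch below uses Python's sqlite3 module with the sample rows from the question. It counts on Name because that is the sort column, and it assumes Name is unique (rows with equal names would get the same number). The helper names are hypothetical.

```python
import sqlite3

ROWS = [
    ("I9GDS720K4", "CatA"), ("LPQTOR25XR", "CatB"), ("EOQ215FT5_", "CatC"),
    ("K2OCS31WTM", "CatD"), ("JV5FIYY4XC", "CatE"), ("C_L7761O2U", "CatF"),
    ("OU3XC6T19K", "CatG"), ("L9YKCYAYMG", "CatH"), ("XKWMQ7HREG", "CatI"),
]


def demo_connection():
    """Insert the rows in reverse so rowid does not match the Name order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Categories (CategoryID TEXT PRIMARY KEY, Name TEXT)")
    conn.executemany("INSERT INTO Categories VALUES (?, ?)", reversed(ROWS))
    return conn


def row_index_by_name(conn, category_id):
    """1-based position of the row when the table is sorted by Name."""
    sql = """
        SELECT (SELECT COUNT(*)
                FROM Categories AS t2
                WHERE t2.Name <= t1.Name) AS row_num   -- rows at or before this one
        FROM Categories AS t1
        WHERE t1.CategoryID = ?
    """
    row = conn.execute(sql, (category_id,)).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    print(row_index_by_name(demo_connection(), "C_L7761O2U"))
```

Comparing on the key column instead of the sort column (as in the RecipeID attempt in the question) ranks rows by key, which is one way to end up with row numbers that do not match the sorted output.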
Here's a funny trick you can use in Spatialite to get the order of values. If you use the count() function with a WHERE clause limiting to only values >= the current value, then the count will actually give the order. So if I have a point layer called "mypoints" with columns "value" and "val_order" then:
SELECT value, (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value) AS val_order
FROM mypoints
ORDER BY value DESC;
Gives the descending order of the values.
I can update the "val_order" column this way:
UPDATE mypoints SET val_order = (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value
);
What you are asking can be explained in two different ways, but I'm assuming you want to sort the resulting table and then number those rows according to the sort.
declare @resultrow int
select @resultrow = RowNum
from (
select CategoryID, row_number() OVER (ORDER BY Name Asc) as RowNum
from Categories
) as Numbered
where CategoryID = 'C_L7761O2U'
select @resultrow
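
A window function numbers only the rows that survive the WHERE clause, so the key filter has to sit outside the numbering query. The sketch below contrasts the two orders using Python's sqlite3 module (SQLite 3.25+ assumed) instead of T-SQL; the query constants and helper names are hypothetical, and the rows are the sample data from the question.

```python
import sqlite3

# Number every row first, then pick the one we want.
NUMBER_THEN_FILTER = """
    SELECT row_num
    FROM (
        SELECT CategoryID,
               ROW_NUMBER() OVER (ORDER BY Name ASC) AS row_num
        FROM Categories
    ) AS numbered
    WHERE CategoryID = ?
"""

# Filter first: only one row is left to number, so it is always row 1.
FILTER_THEN_NUMBER = """
    SELECT ROW_NUMBER() OVER (ORDER BY Name ASC) AS row_num
    FROM Categories
    WHERE CategoryID = ?
"""

ROWS = [
    ("I9GDS720K4", "CatA"), ("LPQTOR25XR", "CatB"), ("EOQ215FT5_", "CatC"),
    ("K2OCS31WTM", "CatD"), ("JV5FIYY4XC", "CatE"), ("C_L7761O2U", "CatF"),
    ("OU3XC6T19K", "CatG"), ("L9YKCYAYMG", "CatH"), ("XKWMQ7HREG", "CatI"),
]


def demo_connection():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Categories (CategoryID TEXT PRIMARY KEY, Name TEXT)")
    conn.executemany("INSERT INTO Categories VALUES (?, ?)", ROWS)
    return conn


def row_number_of(conn, sql, category_id):
    return conn.execute(sql, (category_id,)).fetchone()[0]


if __name__ == "__main__":
    conn = demo_connection()
    print(row_number_of(conn, NUMBER_THEN_FILTER, "C_L7761O2U"))
    print(row_number_of(conn, FILTER_THEN_NUMBER, "C_L7761O2U"))
```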

Retrieve 2 last posts for each category

Let's say I have 2 tables: blog_posts and categories. Each blog post belongs to only ONE category, so there is basically a foreign key between the 2 tables here.
I would like to retrieve the 2 latest posts from each category. Is it possible to achieve this in a single query?
GROUP BY would group everything and leave me with only one row in each category. But I want 2 of them.
It would be easy to perform 1 + N queries (N = number of categories): first retrieve the categories, and then retrieve 2 posts from each category.
I believe it would also be quite easy to perform M queries (M = number of posts I want from each category). The first query selects the first post for each category (with a group by). The second query retrieves the second post for each category, etc.
I'm just wondering if someone has a better solution for this. I don't really mind doing 1 + N queries for that, but for curiosity and general SQL knowledge, it would be appreciated!
Thanks in advance to anyone who can help me with this.
Check out this MySQL article on how to work with the top N things in arbitrarily complex groupings; it's good stuff. You can try this:
SET @counter = 0;
SET @category = '';
SELECT
*
FROM
(
SELECT
@counter := IF(posts.category = @category, @counter + 1, 0) AS counter,
@category := posts.category,
posts.*
posts.*
FROM
(
SELECT
*
FROM test
ORDER BY category, date DESC
) posts
) posts
HAVING counter < 2
SELECT p.*
FROM (
SELECT id,
COALESCE(
(
SELECT datetime
FROM posts pi
WHERE pi.category = c.id
ORDER BY
pi.category DESC, pi.datetime DESC, pi.id DESC
LIMIT 1, 1
), '1900-01-01') AS post_datetime,
COALESCE(
(
SELECT id
FROM posts pi
WHERE pi.category = c.id
ORDER BY
pi.category DESC, pi.datetime DESC, pi.id DESC
LIMIT 1, 1
), 0) AS post_id
FROM category c
) q
JOIN posts p
ON p.category <= q.id
AND p.category >= q.id
AND p.datetime >= q.post_datetime
AND (p.datetime, p.id) >= (q.post_datetime, q.post_id)
Make an index on posts (category, datetime, id) for this to be fast.
Note the p.category <= q.id AND p.category >= q.id hack: it makes MySQL use the "Range checked for each record" access method, which uses the index more efficiently.
See this article in my blog for a similar problem:
MySQL: emulating ROW_NUMBER with multiple ORDER BY conditions
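
For experimenting without MySQL session variables, here is a different and simpler technique: a correlated count of newer posts in the same category, run on SQLite through Python's sqlite3 module. It is a sketch rather than a port of either answer above. The posts table mirrors the column names used in the second answer, the data is invented, and ties on datetime are broken by id.

```python
import sqlite3


def latest_posts_per_category(conn, n):
    """Return the n newest posts of each category (ties broken by higher id)."""
    sql = """
        SELECT p.id, p.category, p.datetime
        FROM posts AS p
        WHERE (
            SELECT COUNT(*)
            FROM posts AS newer
            WHERE newer.category = p.category
              AND (newer.datetime > p.datetime
                   OR (newer.datetime = p.datetime AND newer.id > p.id))
        ) < ?                      -- fewer than n posts are newer than this one
        ORDER BY p.category, p.datetime DESC, p.id DESC
    """
    return conn.execute(sql, (n,)).fetchall()


def demo_connection():
    """Hypothetical posts: category 2 has two posts with the same datetime."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, category INTEGER, datetime TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            (1, 1, "2020-01-01"), (2, 1, "2020-01-02"), (3, 1, "2020-01-03"),
            (4, 2, "2020-02-01"), (5, 2, "2020-02-01"), (6, 2, "2020-01-15"),
            (7, 3, "2020-03-01"),
        ],
    )
    # Same index shape as suggested above: (category, datetime, id).
    conn.execute("CREATE INDEX idx_posts_cat_dt_id ON posts (category, datetime, id)")
    return conn


if __name__ == "__main__":
    for row in latest_posts_per_category(demo_connection(), 2):
        print(row)
```

The correlated count runs once per post, so it is mainly useful for small tables or for checking the output of a faster query.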