SQL select id=1 - sql

I have a table whose id_categoria field holds comma-separated values, e.g. 1,2,3,4,64,31,12,14, because a record can belong to multiple categories. To select the records that belong to category 1, I currently have to run the following SQL query:
SELECT *
FROM cme_notizie
WHERE id_categoria LIKE '1%'
ORDER BY `id` ASC
and then pick out, from that result set, the records whose id_categoria list contains exactly 1. Suppose no record actually has the value 1: values such as 12, 15 or 120 still contain the digit 1 and would match.
Is there a way to match only 1, without also matching other values that merely contain it?

As comments say, you probably shouldn't do that. Instead, you should have another table with one row per category. But if you decide to go with this inferior solution, you can do the following:
SELECT *
FROM cme_notizie
WHERE CONCAT(',', id_categoria, ',') LIKE '%,1,%'
ORDER BY id ASC
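
To see why wrapping the list in commas works, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. SQLite and the sample rows are assumptions for illustration only; the `||` operator stands in for CONCAT, and the lists are assumed to contain no spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cme_notizie (id INTEGER PRIMARY KEY, id_categoria TEXT)")
# Hypothetical sample rows: only ids 1, 3 and 4 really contain category 1.
conn.executemany(
    "INSERT INTO cme_notizie (id, id_categoria) VALUES (?, ?)",
    [(1, "1,2,3"), (2, "12,15,120"), (3, "4,1"), (4, "31,64,1,14"), (5, "21")],
)

# Wrapping the list in commas delimits every entry on both sides,
# including the first and the last, so ',1,' can only match the value 1.
rows = conn.execute(
    "SELECT id FROM cme_notizie "
    "WHERE ',' || id_categoria || ',' LIKE '%,1,%' "
    "ORDER BY id ASC"
).fetchall()
matching_ids = [row[0] for row in rows]
print(matching_ids)  # [1, 3, 4]
```

Rows such as 12,15,120 and 21 are skipped because the digit 1 never appears there with a comma on both sides.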

Related

How to use LIMIT and IN together to have a default row in SQL?

I am exploring SQL on the W3Schools page, and I have a requirement to limit a query to a certain number of rows while always including a default row within that limit.
Here I want a default row where the customer name is Alfreds, and then the remaining 29 rows to fill out the result regardless of their names.
I looked at other SO questions, but they are too complicated for me to understand and use different syntax.
What you are looking for is an ORDER BY clause that sorts the default row first.
Try this:
SELECT * FROM Customers order by (case when CustomerName in ('Alfreds Futterkiste') then 0 else CustomerId end) limit 30 ;
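
A minimal sketch of that ordering trick, run on an in-memory SQLite database through Python's sqlite3 module. The 40 customer rows are made up for illustration, and the default row is deliberately given CustomerId 35 so that a plain LIMIT 30 would miss it. The trick assumes every CustomerId is greater than 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, CustomerName TEXT)")

# Hypothetical data: 40 customers, with the default row at CustomerId 35.
customers = [(i, "Customer %d" % i) for i in range(1, 41)]
customers[34] = (35, "Alfreds Futterkiste")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", customers)

# The CASE expression gives the default row sort key 0, so it comes first;
# every other row keeps its CustomerId as the sort key.
rows = conn.execute(
    "SELECT CustomerId, CustomerName FROM Customers "
    "ORDER BY (CASE WHEN CustomerName IN ('Alfreds Futterkiste') "
    "          THEN 0 ELSE CustomerId END) "
    "LIMIT 30"
).fetchall()

print(rows[0])    # (35, 'Alfreds Futterkiste')
print(len(rows))  # 30
```

The remaining 29 rows are CustomerId 1 through 29, in order.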
If you're going to have a default row in SQL you should really have that row in the table with a known primary key, and then UNION it onto your query:
--default row, that is always included as long as the table has a PK 1
SELECT *
FROM Customers
WHERE CustomerId = 1
UNION ALL
--other rows, a variable number of
SELECT *
FROM Customers
WHERE CustomerId <> 1 AND ...
LIMIT 30
The LIMIT written this way applies to the result of the UNION.
If you ever want to union together individually limited sets in other combinations, you might want to look at a form like
(... LIMIT 2)
UNION ALL
(... LIMIT 28)
Use UNION to combine the two queries. In MySQL, a SELECT inside a UNION that carries its own LIMIT must be parenthesized:
(SELECT *
FROM Customers
WHERE CustomerName != 'Alfreds Futterkiste'
LIMIT 29)
UNION
(SELECT *
FROM Customers
WHERE CustomerName = 'Alfreds Futterkiste')

Group by question in SQL Server, migration from MySQL

I failed to find a solution to my problem and would love your help.
(Post has been edited to contain only one question.)
The question is how to group by one column while selecting multiple columns.
In MySQL you can group by whatever you want and still select all the columns. For example, suppose I want the newest 100 transactions, grouped by Email (only the last transaction for each email).
In MySQL I would do this:
SELECT * FROM db.transactionlog
group by Email
order by TransactionLogId desc
LIMIT 100;
In SQL Server that's not possible. Some googling suggested, as a hack, wrapping each column I want in an aggregate. But couldn't that cause a mix of values, combining columns from different rows of the same group?
For example:
SELECT TOP(100)
Email,
MAX(ResultCode) as 'ResultCode',
MAX(Amount) as 'Amount',
MAX(TransactionLogId) as 'TransactionLogId'
FROM [db].[dbo].[transactionlog]
group by Email
order by TransactionLogId desc
TransactionLogId is the primary key, an identity column; I order by it to get the last inserted row.
I just want to know whether the ResultCode and Amount returned by such a query come from the last inserted row, rather than being the highest values among the grouped rows.
~Edit~
Sample data -
row1:
Email : test#email.com
ResultCode : 100
Amount : 27
TransactionLogId : 1
row2:
Email: test#email.com
ResultCode:50
Amount: 10
TransactionLogId: 2
Using the sample data above, my goal is to get the row details of TransactionLogId = 2,
but what actually happens is that I get a mix of the two rows: TransactionLogId = 2, but the ResultCode and Amount of the first row.
How do I avoid that?
Thanks.
You should first find out which is the latest transaction log by each email, then join back against the same table to retrieve the full record:
;WITH MaxTransactionByEmail AS
(
SELECT
Email,
MAX(TransactionLogId) as LatestTransactionLogId
FROM
[db].[dbo].[transactionlog]
group by
Email
)
SELECT
T.*
FROM
[db].[dbo].[transactionlog] AS T
INNER JOIN MaxTransactionByEmail AS M ON T.TransactionLogId = M.LatestTransactionLogId
You are currently getting mixed results because each aggregate function, such as MAX(), independently considers all rows that share a particular value of Email. So MAX() of the Amount column over the values 10 and 27 is 27, even though that row has the lower TransactionLogId.
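
The difference can be reproduced with the two sample rows from the question. This is a sketch on an in-memory SQLite database through Python's sqlite3 module, not SQL Server; the table is created without the [db].[dbo] prefix, and the join-back query is otherwise the same idea as the CTE above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, "
    "ResultCode INTEGER, Amount INTEGER)"
)
# The two sample rows from the question.
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [(1, "test#email.com", 100, 27), (2, "test#email.com", 50, 10)],
)

# One MAX() per column: each maximum may come from a different row.
mixed = conn.execute(
    "SELECT Email, MAX(ResultCode), MAX(Amount), MAX(TransactionLogId) "
    "FROM transactionlog GROUP BY Email"
).fetchone()

# Find the latest id per email first, then join back for the whole row.
latest = conn.execute(
    "WITH MaxTransactionByEmail AS ("
    "  SELECT Email, MAX(TransactionLogId) AS LatestTransactionLogId "
    "  FROM transactionlog GROUP BY Email) "
    "SELECT T.Email, T.ResultCode, T.Amount, T.TransactionLogId "
    "FROM transactionlog AS T "
    "INNER JOIN MaxTransactionByEmail AS M "
    "  ON T.TransactionLogId = M.LatestTransactionLogId"
).fetchone()

print(mixed)   # ('test#email.com', 100, 27, 2)
print(latest)  # ('test#email.com', 50, 10, 2)
```

The first result mixes row 1's ResultCode and Amount with row 2's id; the second is exactly row 2.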
Another solution is using a ROW_NUMBER() window function to get a row-ranking by each Email, then just picking the first row:
;WITH TransactionsRanking AS
(
SELECT
T.*,
MostRecentTransactionLogRanking = ROW_NUMBER() OVER (
PARTITION BY
T.Email -- Start a different ranking for each different value of Email
ORDER BY
T.TransactionLogId DESC) -- Order the rows by the TransactionLogID descending
FROM
[db].[dbo].[transactionlog] AS T
)
SELECT
T.*
FROM
TransactionsRanking AS T
WHERE
T.MostRecentTransactionLogRanking = 1
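
A runnable sketch of the ROW_NUMBER() approach, again on in-memory SQLite through Python's sqlite3 module rather than SQL Server. It assumes SQLite 3.25 or newer (the first version with window functions), and the email addresses and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, "
    "ResultCode INTEGER, Amount INTEGER)"
)
# Hypothetical rows for two emails; ids 4 and 5 are the latest per email.
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", 100, 27),
        (2, "a@example.com", 50, 10),
        (3, "b@example.com", 0, 5),
        (4, "a@example.com", 75, 99),
        (5, "b@example.com", 20, 1),
    ],
)

# Rank the rows of each email from newest to oldest, then keep rank 1.
newest = conn.execute(
    "WITH TransactionsRanking AS ("
    "  SELECT T.*, ROW_NUMBER() OVER ("
    "    PARTITION BY T.Email ORDER BY T.TransactionLogId DESC"
    "  ) AS MostRecentTransactionLogRanking "
    "  FROM transactionlog AS T) "
    "SELECT TransactionLogId, Email, ResultCode, Amount "
    "FROM TransactionsRanking "
    "WHERE MostRecentTransactionLogRanking = 1 "
    "ORDER BY TransactionLogId DESC"
).fetchall()

print(newest)  # [(5, 'b@example.com', 20, 1), (4, 'a@example.com', 75, 99)]
```

Every returned row is a real row from the table, so ResultCode and Amount always belong to the same transaction as the id.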

Get distinct information across many fields some of which are NULL

I have a table with just over 65 million rows and 140 columns. The data comes from several sources and is submitted at least every month.
I am looking for a quick way to grab specific fields from this data only where they are unique. I want to process all the information to work out which invoice was sent with which identifying numbers, and who sent it, but I don't want to iterate over 65 million records. If I can get distinct values, I will only have to process, say, 5 million records instead of 65 million. See below for a description of the data and the SQL Fiddle for a sample.
If, say, a client submits an invoice_number linked to passport_number_1, national_identity_number_1 and driving_license_1 every month, I only want one row where this appears, i.e. the 4 fields have to be unique.
If they submit the above for 30 months, and then in the 31st month they send the invoice_number linked to passport_number_1, national_identity_number_2 and driving_license_1, I want to pick this row as well, since the national_identity field is new and hence the whole row is unique.
By "linked to" I mean they appear on the same row.
Any of the fields may be NULL at some point.
The 'pivot/composite' columns are invoice_number and submitted_by. If either of those is missing, drop that row.
I also need to include the database_id with the above data, i.e. the primary_id which is auto-generated by the PostgreSQL database.
The only fields that don't need to be returned are other_column and yet_another_column. Remember the table has 140 columns, so I don't need them.
With the results, create a new table that will hold these unique records.
See this SQL fiddle for an attempt to recreate the scenario.
From that fiddle, I'd expect a result like:
Rows 1, 2 and 11: only one of them is kept, as they are exactly the same; preferably the row with the smallest id.
Rows 4 and 9: one of them is dropped, as they are exactly the same.
Rows 5, 7 and 8: dropped, since they are missing either the invoice_number or submitted_by.
The result would then have Row (1, 2 or 11), 3, (4 or 9), 6 and 10.
To get one representative row (with additional fields) from a group with the four distinct fields:
SELECT
distinct on (
invoice_number
, passport_number
, national_id_number
, driving_license_number
)
* -- specify the columns you want here
FROM my_table
where invoice_number is not null
and submitted_by is not null
;
Note that it is unpredictable which row exactly is returned unless you specify an ordering (documentation on distinct).
Edit:
To order this result by id, simply adding order by id to the end doesn't work, but it can be done by either using a CTE
with distinct_rows as (
SELECT
distinct on (
invoice_number
, passport_number
, national_id_number
, driving_license_number
-- ...
)
* -- specify the columns you want here
FROM my_table
where invoice_number is not null
and submitted_by is not null
)
select *
from distinct_rows
order by id;
or making the original query a subquery
select *
from (
SELECT
distinct on (
invoice_number
, passport_number
, national_id_number
, driving_license_number
-- ...
)
* -- specify the columns you want here
FROM my_table
where invoice_number is not null
and submitted_by is not null
) t
order by id;
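
Since the question prefers the row with the smallest id, a related formulation is to group by the four fields and take MIN(id). The sketch below is a swapped-in technique, not the DISTINCT ON query above: it runs on in-memory SQLite through Python's sqlite3 module, because SQLite has no DISTINCT ON. The sample rows and the first_id alias are hypothetical; the column names follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, invoice_number TEXT, passport_number TEXT, "
    "national_id_number TEXT, driving_license_number TEXT, submitted_by TEXT)"
)
# Hypothetical rows: 2 repeats 1, 7 repeats 6, 4 and 5 miss a required field.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "inv1", "p1", "n1", "d1", "alice"),
        (2, "inv1", "p1", "n1", "d1", "alice"),
        (3, "inv1", "p1", "n2", "d1", "alice"),
        (4, None, "p1", "n1", "d1", "alice"),
        (5, "inv2", None, "n3", "d2", None),
        (6, "inv2", None, "n3", "d2", "bob"),
        (7, "inv2", None, "n3", "d2", "bob"),
    ],
)

# GROUP BY puts NULLs in the same group, so rows 6 and 7 collapse together.
rows = conn.execute(
    "SELECT MIN(id) AS first_id, invoice_number, passport_number, "
    "       national_id_number, driving_license_number "
    "FROM my_table "
    "WHERE invoice_number IS NOT NULL AND submitted_by IS NOT NULL "
    "GROUP BY invoice_number, passport_number, "
    "         national_id_number, driving_license_number "
    "ORDER BY first_id"
).fetchall()
kept_ids = [row[0] for row in rows]
print(kept_ids)  # [1, 3, 6]
```

To fetch further columns of the kept rows, the resulting ids can be joined back against the table.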
"quick way to grab specific fields from this data only where they are unique"
I don't think so. I think you mean you want to select a distinct set of rows from a table in which they are not unique.
As far as I can tell from your description, you simply want
SELECT distinct invoice_number, passport_number,
driving_license_number, national_id_number
FROM my_table
where invoice_number is not null
and submitted_by is not null;
In your SQLFiddle example, that produces 5 rows.

SQL Server Sum multiple rows into one - no temp table

I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row,
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
And I would like to do this without using a temp table. I think I'll need to use OVER (PARTITION BY ...) but I'm not sure. I'm just trying to challenge myself not to do it the temp-table way. I'm working with SQL Server 2008 R2.
Simply put, give me a concise statement that gets from my input to my output in the same table. So if my table is named People, a select * from People before the operation returns the first set above, and a select * from People after the operation returns the second set.
I'm not sure why you'd avoid a temp table, but here's one way to do without it (though IMHO this is overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
The first query sets every row for a person to that person's total. The second query removes all but one row per person.
Demo: http://sqlfiddle.com/#!3/db7aa/11
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
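
A quick check of the GROUP BY query against the sample data from the question, sketched on in-memory SQLite through Python's sqlite3 module (an assumption for runnability; the query itself is plain SQL). Note that it only reads the summed view and leaves the table untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person INTEGER, Value INTEGER)")
# Sample data from the question.
conn.executemany("INSERT INTO People VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# One output row per distinct Person, with that person's values summed.
totals = conn.execute(
    "SELECT Person, SUM(Value) FROM People GROUP BY Person ORDER BY Person"
).fetchall()
print(totals)  # [(1, 30), (2, 15)]
```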

Last id value in a table. SQL Server

Is there a way to find the last (nth) id value of a table without scanning it completely (just go to the end of the table and get the id value)?
table
id fieldvalue
1 2323
2 4645
3 556
... ...
100000000 1232
So for example here n = 100000000 (100 million).
--------------EDIT-----
So which one of the queries proposed would be more efficient?
SELECT MAX(id) FROM <tablename>
Assuming ID is the IDENTITY for the table, you could use SELECT IDENT_CURRENT('TABLE NAME').
See here for more info.
One thing to note about this approach: if you have INSERTs that fail but still increment the IDENTITY counter, you will get back a result that is higher than the one returned by SELECT MAX(id) FROM <tablename>.
You can also use the system views to get the last values of all identity columns in the database:
select
OBJECT_NAME(object_id) + '.' + name as col_name
, last_value
from
sys.identity_columns
order by last_value desc
If rows are inserted into table1 first, and rows that depend on ids from table1 are then inserted into table2, you can use a SELECT:
INSERT INTO `table2` (`some_id`, `some_value`)
VALUES ((SELECT some_id
FROM `table1`
WHERE `other_key_1` = 'xxx'
AND `other_key_2` = 'yyy'),
'some value abc abc 123 123 ...');
Of course, this only works if there are other identifiers that uniquely identify rows in table1.
First of all, you want to access the table in DESCENDING order by ID.
Then you would select the TOP N records.
At this point, you want the last record of that set. Assuming the id field is indexed, this retrieves at most the last N records of the table, and will most likely be optimized into a single record fetch.
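
The descending-order idea can be sketched as follows, on in-memory SQLite through Python's sqlite3 module. SQLite has no TOP, so LIMIT stands in for it here; the table name demo, the helper nth_last_id and the 1000 sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, fieldvalue INTEGER)")
conn.executemany(
    "INSERT INTO demo (id, fieldvalue) VALUES (?, ?)",
    [(i, i * 7) for i in range(1, 1001)],
)

def nth_last_id(conn, n):
    """Return the nth id counted from the end of the table (1 = last id)."""
    # Inner query: the last n ids, newest first (LIMIT plays the role of TOP).
    # Outer query: the smallest of those, i.e. the nth from the end.
    row = conn.execute(
        "SELECT id FROM (SELECT id FROM demo ORDER BY id DESC LIMIT ?) "
        "ORDER BY id ASC LIMIT 1",
        (n,),
    ).fetchone()
    return row[0] if row else None

last_id = nth_last_id(conn, 1)
fifth_last_id = nth_last_id(conn, 5)
print(last_id, fifth_last_id)  # 1000 996
```

Because id is the primary key, the engine can walk the index from its end instead of reading the whole table.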
Select Ident_Current('Your Table Name') gives the last Id of your table.