I am using SQL and PL/SQL.
I have a table with a single column "name". The data in this column is a semicolon separated string. I want to count the number of elements in each row.
For example, if there are two rows in the table, one with the string 'smith;black;tiger' and one with the string 'x;y', I want the result of the query to be 3 for the first row and 2 for the second row.
How can I write a SQL query that will count the number of elements in a separated list of values?
Your question is very hard to understand, but I think you are saying that you have a table with a column "name", and inside that column, each cell contains multiple values separated by semicolons. You want to count the number of semicolon-separated values in each row.
The PL/SQL string functions could help you here, but really you are talking about writing program code for this task (it isn't normally the job of a database query language). For example, you could use this trick to count the number of semicolons, and then add one:
LENGTH(name) - LENGTH(TRANSLATE(name,'x;','x')) + 1
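A quick way to sanity-check the trick, sketched in Python with SQLite for illustration (SQLite has no TRANSLATE, so REPLACE plays the same role of removing the semicolons before comparing lengths):

```python
import sqlite3

# Sketch of the separator-counting trick using SQLite.
# LENGTH(name) - LENGTH(name with ';' removed) = number of semicolons;
# adding 1 gives the number of elements.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("smith;black;tiger",), ("x;y",)])

rows = conn.execute(
    "SELECT name, LENGTH(name) - LENGTH(REPLACE(name, ';', '')) + 1 "
    "FROM t"
).fetchall()
print(rows)  # [('smith;black;tiger', 3), ('x;y', 2)]
```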
But the point @Johan is making is that this is a bad way to structure your data. For example, it makes it impossible to look up by name properly. You can't say WHERE name = 'smith', because the name doesn't equal 'smith', it equals 'smith;black;tiger'. You'll have to do a substring search with WHERE name LIKE '%smith%', and that will be both inefficient and wrong. What if I ask to look up the name 'smi'? WHERE name LIKE '%smi%' will incorrectly find that row -- there is no way to say that 'smith' is in the table but 'smi' is not, because both are substrings of 'smith;black;tiger'.
A much better solution, if you want an entry to have multiple names, is to give the entry a unique ID; say 123. (You can ask SQL to automatically generate unique IDs for table rows.) Then, have a separate table for names which maps back onto that row. So say you gave the "smith;black;tiger" row the ID 123 and the "x;y" row the ID 124. Now you would have another table NameMap with two columns:
name  | entry
------+------
smith | 123
black | 123
tiger | 123
x     | 124
y     | 124
Now you can look up names in that table and map them onto the name entries table with a join.
And you'll be able to answer your question, "how many names correspond to each row of the entry table", with a GROUP BY query like this:
SELECT entry, COUNT(*) AS name_count FROM NameMap GROUP BY entry
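A minimal sketch of this normalized design, using SQLite via Python for illustration (table and column names as above, with the two sample entries):

```python
import sqlite3

# Sketch: one row per (name, entry) pair, counted per entry with GROUP BY.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NameMap (name TEXT, entry INTEGER)")
conn.executemany("INSERT INTO NameMap VALUES (?, ?)", [
    ("smith", 123), ("black", 123), ("tiger", 123),
    ("x", 124), ("y", 124),
])

counts = conn.execute(
    "SELECT entry, COUNT(*) FROM NameMap GROUP BY entry ORDER BY entry"
).fetchall()
print(counts)  # [(123, 3), (124, 2)]
```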
You can hack this using this code
SQL code horror
SELECT name
     , (LENGTH(COALESCE(vals, ''))
        - LENGTH(TRANSLATE(COALESCE(vals, ''), 'x;', 'x'))) + 1 AS value_count
FROM table1
ORDER BY name
Remarks
CSV in a database is a very bad anti-pattern.
VALUES is a reserved word; it's a bad idea to name a column after a reserved word.
Related
Pretend there is a database with every city in the world and a unique ID to go with each one. I have a list of 50 IDs in one column of an Excel document.
How do I return the names of the 50 cities most efficiently? Do I really need to write a WHERE clause with ID = x OR ID = y ... etc.?
You can just use an IN subquery:
select cityname from tableofeverycity a where id in (select id from tbl50ids)
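A minimal sketch of the IN lookup, using SQLite via Python (the table names are the hypothetical ones from the answer above, with a few sample rows):

```python
import sqlite3

# Sketch: look up city names whose ids appear in a second table,
# using an IN subquery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableofeverycity (id INTEGER, cityname TEXT)")
conn.execute("CREATE TABLE tbl50ids (id INTEGER)")
conn.executemany("INSERT INTO tableofeverycity VALUES (?, ?)",
                 [(1, "Oslo"), (2, "Lima"), (3, "Cairo")])
conn.executemany("INSERT INTO tbl50ids VALUES (?)", [(1,), (3,)])

names = conn.execute(
    "SELECT cityname FROM tableofeverycity "
    "WHERE id IN (SELECT id FROM tbl50ids) ORDER BY id"
).fetchall()
print(names)  # [('Oslo',), ('Cairo',)]
```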
In SQL terms you would be looking for something like:
SELECT names FROM tableofcities
Note that without an ORDER BY clause, the rows won't necessarily come back in the order they are listed.
Because you mention that these are listed in an Excel document rather than a database, it makes things a little different.
I'd recommend checking out the following link for more details on your question.
How to run a SQL query on an Excel table?
In Excel, create a formula that is essentially (assuming the empty cell is B1 and your IDs are in column A starting at A2):
   (empty)
1  =B1&","&A2
2  =B2&","&A3
3  =B3&","&A4
You can write the formula once and copy it down.
The results will look like:
1 ,1
2 ,1,2
3 ,1,2,3
Go to the 50th row and copy the result of the formula there.
Next, paste them into a query in your favorite GUI:
select c.*
from cities c
where cityid in (<paste list here>);
Remove the first comma.
Run the query.
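If the IDs are reachable from code, the same comma-joined list can be built without spreadsheet formulas; a sketch in Python (the IDs are made up, and in real code parameter binding is safer than string concatenation):

```python
# Sketch: build the comma-separated IN list programmatically,
# mirroring the paste-into-a-GUI workflow described above.
ids = [101, 205, 350]  # hypothetical IDs read from the spreadsheet
in_list = ",".join(str(i) for i in ids)
query = f"select c.* from cities c where cityid in ({in_list});"
print(query)  # select c.* from cities c where cityid in (101,205,350);
```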
I have two tables in Oracle, as below, and I need to map data between them, either from a stored procedure or in C# code for .NET Core. Either will work for me.
The first table contains data in key-value form, where the key is the ID in the second table and the value is the actual data required.
First table :
ID Data
1 {"f100000":["02/02/2012"],"f100001":["01/04/2013"]}
So "f100000", "f100001", etc. are the keys, and each key is an ID in the second table.
Second table has simple data with ID and Name
ID Name
f100000 Name of the field
f100001 Name of the field2
I would expect the result will be as per below:
Key Value
Name of the field 02/02/2012
Name of the field2 01/04/2013
This is not going to happen in its current form without extra code that is redundant and inefficient, when the database design could be improved instead.
Can the design of table 1 not be improved, i.e. given separate columns for the f###### values and for the corresponding dates?
That way the f###### values can be indexed, so joins between the two tables will run efficiently.
This amendment would need to be made in the code which inserts records into the table.
If not, you would have to:
Select the rows from table1.
Split the string into an array on the ',' character.
While looping through that array, split each element into a two-element array on the ':' character.
Strip the '[', ']' and '"' characters from the date value so it can be parsed.
While looping, SELECT from table_2 where table_2.id equals the first element of each pair.
Print out the results line by line.
As this would need to be done each time the code is run, it is very inefficient. Much better to redesign table 1 and add the insert logic as required.
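Assuming the Data column always holds valid JSON (as in the sample row), the manual splitting steps above can be replaced by a JSON parser; a Python sketch using the sample values from the question:

```python
import json

# Sketch: parse the Data JSON and map each key to its name from the
# second table (here a plain dict standing in for table 2).
data = '{"f100000":["02/02/2012"],"f100001":["01/04/2013"]}'
names = {"f100000": "Name of the field", "f100001": "Name of the field2"}

result = [(names[key], vals[0]) for key, vals in json.loads(data).items()]
print(result)
# [('Name of the field', '02/02/2012'), ('Name of the field2', '01/04/2013')]
```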
Here is my question: I have a SQL table called translation. For simplicity, it has three columns: id, ENG and FR, with id as the primary key. In the database, I noticed two records with the same ENG text value "First Name". I removed all leading and trailing spaces using LTRIM and RTRIM, but when I ran
SELECT ENG, COUNT(*)
FROM translation
GROUP BY ENG
HAVING COUNT(*) > 1
it returned zero records. Also, when I do
select *
from translation
where ENG = 'First Name'
I only get one record returned, instead of two. My task was to remove duplicate records, and clearly those two should be merged. So I was wondering: is there a way to compare text fields and spot the differences, or a way to merge all text fields with the same letters into one? I don't want to write an update statement for every case like this. Thanks in advance!
Note: columns look a bit different but there's my point
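One common cause is an invisible character that TRIM won't touch; comparing the two strings character by character exposes it. A Python sketch, with an assumed hidden non-breaking space as the culprit (the sample values are illustrative):

```python
# Sketch: when two "identical" strings refuse to group together, listing
# the positions where their characters differ usually reveals a hidden
# character (here a non-breaking space, U+00A0, versus a plain space).
a = "First Name"
b = "First\u00a0Name"
diff = [(i, ca, cb) for i, (ca, cb) in enumerate(zip(a, b)) if ca != cb]
print(diff)  # [(5, ' ', '\xa0')]
```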
I have an indexed (nonclustered) string column (let's just call it 'Identifier') on a table with the following row values:
`0000001`
`0000245`
`001`
`AB0001`
I want to be able to efficiently return all the rows that have an Identifier ending with a certain number entered by the user. For example, when the user enters 1 then the following rows should be returned:
0000001
001
AB0001
The problem is that using WHERE Identifier LIKE CONCAT(N'%', @UserInput) causes an index scan, which doesn't scale well since the table has tons of rows in it (many millions).
How should I efficiently query this data? My first thought is to add a new column that stores the REVERSE() of the Identifier column, and then use WHERE ReversedIdentifier LIKE CONCAT(REVERSE(@UserInput), N'%') to find the matches using a "starts with" search.
This doesn't seem like the cleanest solution, but it's all I can think of at the moment. Is there a better way?
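The reverse trick works because a string ends with a suffix exactly when its reversal starts with the reversed suffix, and a "starts with" predicate can use an ordinary index. A small Python sketch of the idea, using the sample identifiers from above:

```python
# Sketch: s ends with suffix  <=>  reversed(s) starts with reversed(suffix).
# A prefix search on a persisted reversed column can seek an index instead
# of scanning.
identifiers = ["0000001", "0000245", "001", "AB0001"]
user_input = "1"
matches = [s for s in identifiers
           if s[::-1].startswith(user_input[::-1])]
print(matches)  # ['0000001', '001', 'AB0001']
```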
If you had a separate column holding just the number component, typed as a number and covered by an index, queries against it would be a lot faster.
I have a table (a) that contains imported data, and one of the values in that table needs to be joined to another table (b) based on that value. In table b, sometimes that value is in a comma separated list, and it is stored as a varchar. This is the first time I have dealt with a database column that contains multiple pieces of data. I didn't design it, and I don't believe it can be changed, although I believe it should be.
For example:
Table a:
column_1
12345
67890
24680
13579
Table b:
column_1
12345,24680
24680,67890
13579
13579,24680
So I am trying to join these tables together based on this number (and two others), but when I run my query, I'm only getting the row that contains just 13579, and none of the rest.
Any ideas how to accomplish this?
Storing lists as a comma delimited data structure is a sign of bad design, particularly when storing ids, which are presumably an integer in their native format.
Sometimes, this is necessary. Here is a method:
select *
from a join b
     on ',' + b.column_1 + ',' like '%,' + cast(a.column_1 as varchar(255)) + ',%'
This will not perform particularly well, because the query will not take advantage of any indexes.
The idea is to put the delimiter (,) at the beginning and end of b.column_1. Every value in the column then has a comma before and after. Then, you can search for the match in a.column_1 with commas appended. The commas ensure that 10 does not match 100.
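The wrapping trick can be sketched in a few lines of Python to see why the added commas make the match exact:

```python
# Sketch of the delimiter trick: wrapping both sides in commas makes the
# substring test exact, so 10 does not match inside 100.
def list_contains(csv_value, target):
    return f",{target}," in f",{csv_value},"

print(list_contains("10,200", "10"))   # True
print(list_contains("100,200", "10"))  # False
```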
If possible, you should consider an alternative way to represent the data. If you know there are at most two values, you might consider having two columns in b. In general, though, you would have a "join" table, with a separate row for each pair.