Concatenating text data in a column using SQL

Does anyone know how to concatenate all the values of a column in a query using SQL only? I know there are ways using database-specific tools such as PIVOT, but I don't think I have access to anything like that in InfoMaker.
I am using InfoMaker to produce labels for sample bottles. The bottles can be analysed for more than one thing. If I join the bottle and analysis tables I get several rows per bottle, which results in multiple labels, so I was hoping to concatenate all the values for the analysis using SQL and then use a computed value based on this to add something useful to the label. I can only use one transaction based on a select query to do this.
Adding additional tables or columns to the database would be highly discouraged.

In some Oracle versions you can use wm_concat:
SELECT field1, wm_concat(field2) FROM YourTable
GROUP BY field1;
Otherwise you can use LISTAGG.
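Outside Oracle the same grouped-concatenation idea is available under other names; here is a minimal sketch using SQLite's group_concat via Python's sqlite3 module. The table and column names (samples, bottle_id, analysis) are invented for illustration and are not from the question's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical bottle/analysis layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (bottle_id INTEGER, analysis TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, "pH"), (1, "lead"), (2, "nitrate")],
)

# group_concat is SQLite's analogue of Oracle's wm_concat/LISTAGG:
# one row per bottle, with all analyses joined into a single string.
rows = conn.execute(
    "SELECT bottle_id, group_concat(analysis, ',') "
    "FROM samples GROUP BY bottle_id ORDER BY bottle_id"
).fetchall()
print(rows)  # e.g. [(1, 'pH,lead'), (2, 'nitrate')] -- order within a group is not guaranteed
```

Note that SQLite does not guarantee the concatenation order within a group, whereas Oracle's LISTAGG lets you specify it with WITHIN GROUP (ORDER BY ...).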

Related

Reason for not being able to use DISTINCT more than once in SQL

I know that you can use DISTINCT with multiple columns, but not more than once in a single SQL query. What is the reason for this? I can imagine that, since the output of SQL is rows, applying DISTINCT to two columns separately would create a conflict over which rows to output, but is there any more to it? Perhaps someone could clarify this.
This is common to many SQL implementations (I have tried MySQL, PostgreSQL and SQLite). An example query would be:
SELECT DISTINCT(column1), DISTINCT(column2) FROM table;
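One way to see the issue is that DISTINCT is not a function applied to a column: it operates on the whole output row. A small SQLite demo via Python (table and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("a", "x"), ("a", "y"), ("a", "x")])

# DISTINCT removes duplicate (column1, column2) pairs as a whole,
# so both ('a', 'x') and ('a', 'y') survive even though column1 repeats.
rows = conn.execute(
    "SELECT DISTINCT column1, column2 FROM t ORDER BY column1, column2"
).fetchall()
print(rows)  # → [('a', 'x'), ('a', 'y')]
```

Applying DISTINCT to each column independently would produce value lists of different lengths ( one value for column1, two for column2 here), which cannot be assembled back into rows; that is why per-column DISTINCT is not meaningful.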

Indefinite number of columns in PIVOT

I have created a function with PIVOT that turns the string into a table, which is the output I want. What I want is to not have to declare the columns one by one.
Here's the code:
SELECT
    Name1, Name2, Name3
FROM
(
    SELECT
        LEFT(CA.val, CHARINDEX('=', CA.val) - 1) ColumnName,
        SUBSTRING(CA.val, CHARINDEX('=', CA.val) + 1, 100) Value
    FROM Z_SampleTable
    CROSS APPLY dbo.SplitSample(ProcessBody, '|') CA
) PD
PIVOT
(
    MAX(Value)
    FOR ColumnName IN (Name1, Name2, Name3)
) AS PT
Is there a way that doesn't need to declare each column? For example, I have a table that contains 100 columns, is there a way not to declare those 100 columns?
With SQL-based solutions you're required to use dynamic SQL. Build your SELECT query dynamically, crafting the PIVOT bit yourself (beware of injection attacks!), and then use EXEC(@dynamicSql) to get the results. Something like this answer to another question.
I've done a fair share of research into this, and if you're willing to go outside the realm of plain SQL queries you'll start to get more options. SSRS, for example, is pretty good at dynamically pivoting normalized datasets as long as they're small. For bigger scenarios while staying with SQL you'll need bigger guns, all of which require being "smart" about your queries (i.e. building them dynamically).
Finally, for bigger scenarios you can use polyglot persistence and create a replica of your data in a schemaless NoSQL solution. These typically let you do this with ease.
But the bottom line, for ad hoc SQL queries: you'll have to build the FOR ColumnName IN (...) bit dynamically and then EXEC(@myQuery).
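The dynamic-SQL idea can be sketched outside SQL Server too. SQLite has no PIVOT keyword, but the two steps are the same: discover the column names with one query, then build and execute the pivot query as a string (here MAX(CASE ...) stands in for PIVOT). All table and column names below are invented for illustration, and real code must sanitize the interpolated names against injection:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pd (id INTEGER, ColumnName TEXT, Value TEXT)")
conn.executemany("INSERT INTO pd VALUES (?, ?, ?)",
                 [(1, "Name1", "a"), (1, "Name2", "b"), (2, "Name1", "c")])

# Step 1: discover the column names instead of declaring them by hand.
names = [r[0] for r in conn.execute(
    "SELECT DISTINCT ColumnName FROM pd ORDER BY ColumnName")]

# Step 2: build the pivot query dynamically.  Each discovered name
# becomes one MAX(CASE ...) output column.
cols = ", ".join(
    "MAX(CASE WHEN ColumnName = '{0}' THEN Value END) AS \"{0}\"".format(n)
    for n in names)
sql = f"SELECT id, {cols} FROM pd GROUP BY id ORDER BY id"
rows = conn.execute(sql).fetchall()
print(rows)  # → [(1, 'a', 'b'), (2, 'c', None)]
```

With 100 distinct names the loop simply produces 100 output columns; nothing is declared by hand.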

Compare items in comma separated list to another set of items in a comma separated list using limited Oracle SQL

I would like to write a simple Oracle SQL 10g statement to compare the items (strings) in an arbitrarily defined comma separated list to a field containing a list of comma separated strings. The items in the defined list can appear in the field in any order, and exact matches need to be found (not substrings).
I have a working solution using a series of regexp_like() statements entered manually, but I need to hand this off to a client who will be maintaining this moving forward, and would like to be able to just update the comma separated string directly.
I also have some software/GUI-based limitations on what I can do with Oracle SQL to accomplish this. Specifically, I cannot use any PL/SQL, and this must be written in a single select statement (no temporary tables or anything fun/useful like that). I've found a number of solutions to what I'm trying to accomplish, but almost all depended on being able to write custom functions.
So, now that backstory/limitations are out of the way, let's get down to the nitty gritty.
Example arbitrary (client-provided) list: ItemA,ItemB,ItemC
Table ITEMS:
Column Items (varchar2 of some arbitrary but sufficient length)
ItemA,ItemB,ItemC
ItemC,ItemB,ItemA
ItemX,ItemC,ItemY,ItemA,ItemB,ItemB
ItemX,ItemY,ItemC
I want a single select statement that will basically select all rows where Items contains "ItemA" and "ItemB" and "ItemC", but without having to break that string up manually. In this case, it would match the first, second and third row, but not the fourth row.
(EDIT) I realize this table structure is badly designed. At this time I do not know if we can go back to the client to fix this, as the data may already be used as is elsewhere, so a redesign would be costly and time consuming. I'm sure a lot of you are used to this scenario: the initial system was designed poorly, and now I've been brought in to consult on difficulties arising from the poor design. Let's assume the table cannot be normalized and must be used as is.
It is entirely possible that what I would like to do is simply not possible given the limitations of the interface I need to use, but my SQL knowledge is not great enough to determine that.
Thank you all very much for taking the time to read this question. Please let me know if anything is confusing or needs expansion or clarification.
While I absolutely agree with commentators that the data model is flawed, sometimes you have to work with what you're given. If it is really impossible to change the data model then you can do this, but it isn't entirely pretty, and depends on your 'restrictions' not excluding the use of common table expressions - I've seen tools struggle with those...
with items_cte as (
    select id, regexp_substr(items, '[^,]+', 1, level) as item, level as pos
    from items
    connect by level <= regexp_count(items, '[^,]+')
        and prior id = id
        and prior sys_guid() is not null
),
list_cte as (
    select regexp_substr(:list, '[^,]+', 1, level) as item,
        count(*) over () as list_length
    from dual
    connect by level <= regexp_count(:list, '[^,]+')
)
select i.id, listagg(i.item, ',') within group (order by i.pos) as items
from items_cte i
join list_cte l on l.item = i.item
group by i.id
having count(distinct i.item) = max(l.list_length)
order by i.id;
        ID ITEMS
---------- --------------------------------------------
         1 ItemA,ItemB,ItemC
         2 ItemC,ItemB,ItemA
         3 ItemC,ItemA,ItemB,ItemB
SQL Fiddle.
This is using two common table expressions (CTEs, also known as subquery factoring). They each split a comma-separated list into pseudo-rows. The list breakdown is fairly simple and uses regex functions you seem to be familiar with. The items one is a bit more complicated because the connect by clause doesn't generally work very well across multiple rows. It uses a trick: applying the prior clause to any non-deterministic function - sys_guid() here, but you can use others - stops the hierarchy from cycling and mixing values from different original rows. I've also assumed you have a unique ID column on the table.
The Fiddle shows the two separate split results, as well as the final result from joining those.
Finally, listagg is used to put the split values back together in their original order, and the count(distinct i.item) check only shows results where all the values from the list were matched. The distinct is needed to match your third row, since ItemB appears twice.
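The matching rule itself - every item in the client list must appear in the row, order and duplicates ignored - is just a set-containment test, which is exactly what the HAVING count(distinct ...) clause implements. A Python illustration using the sample data from the question (not a replacement for the SQL, just a restatement of the logic):

```python
# Sample rows and client-provided list from the question.
rows = [
    "ItemA,ItemB,ItemC",
    "ItemC,ItemB,ItemA",
    "ItemX,ItemC,ItemY,ItemA,ItemB,ItemB",
    "ItemX,ItemY,ItemC",
]
wanted = set("ItemA,ItemB,ItemC".split(","))

# A row matches when the wanted set is a subset of its deduplicated
# item set -- duplicates (ItemB twice in row 3) don't matter.
matches = [r for r in rows if wanted <= set(r.split(","))]
print(matches)  # the first three rows match, the fourth does not
```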

Select all values in a column into a string in SQLite?

In SQL Server I can do this with the help of a cursor (to loop through each row in the result set and build my string), but I don't know how to do the same in SQLite. This task is not like so-called pivoting, where as far as I know you have to know the exact rows you want to turn into columns. My case is different: I want to select all the values of a specified column into a string (which can then be selected as a column of one row), and this column can have various values depending on the SELECT query. Here is a sample of how it looks:
A | B
------------
1 | 0
8 | 1
3 | 2
... ....
I want to select all the values of column A into a string like this: "183..." (or "1,8,3,..." would be fine).
This can be used as a column in another SELECT; I need it because I have to display all the sub-items (the values of column A) as a comma-separated list in another row.
I don't have any idea how to do this. It seems to need procedural statements (such as in a stored procedure), but SQLite is very limited in what it allows for loops, variable declarations, and so on. I'm really stuck. This kind of task is very common in database programming, and there is no reason I should refuse to do it.
Please help, your help would be highly appreciated! Thanks!
If you're just trying to get all the values from Column A into a single record, then use GROUP_CONCAT:
select group_concat(a, ',')
from yourtable
SQL Fiddle Demo
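Since the asker also wants the concatenated string as a column of another SELECT, group_concat can sit in a correlated subquery. A runnable sketch via Python's sqlite3; the table names (main_t, sub_t) and schema are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sub_t (main_id INTEGER, a INTEGER)")
conn.executemany("INSERT INTO main_t VALUES (?, ?)",
                 [(1, "first"), (2, "second")])
conn.executemany("INSERT INTO sub_t VALUES (?, ?)",
                 [(1, 1), (1, 8), (1, 3), (2, 5)])

# The comma-separated list of sub_t.a values appears as one extra
# column on each main_t row -- no cursor or loop needed.
rows = conn.execute("""
    SELECT m.name,
           (SELECT group_concat(s.a, ',') FROM sub_t s
             WHERE s.main_id = m.id) AS a_list
    FROM main_t m ORDER BY m.id
""").fetchall()
print(rows)  # e.g. [('first', '1,8,3'), ('second', '5')] -- concatenation order is not guaranteed
```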
The main problem is that you are thinking like an application programmer. SQL does a lot of things for you.
SELECT *
FROM tableA
WHERE A IN (SELECT A
FROM tableB)
No need to resort to cursors, stored procedures and multiple queries.

MySQL - Selecting data from multiple tables all with same structure but different data

Ok, here is my dilemma I have a database set up with about 5 tables all with the exact same data structure. The data is separated in this manner for localization purposes and to split up a total of about 4.5 million records.
A majority of the time only one table is needed and all is well. However, sometimes data is needed from 2 or more of the tables and it needs to be sorted by a user defined column. This is where I am having problems.
data columns:
id, band_name, song_name, album_name, genre
MySQL statement:
SELECT * from us_music, de_music where `genre` = 'punk'
MySQL spits out this error:
#1052 - Column 'genre' in where clause is ambiguous
Obviously, I am doing this wrong. Anyone care to shed some light on this for me?
I think you're looking for the UNION clause, a la
(SELECT * from us_music where `genre` = 'punk')
UNION
(SELECT * from de_music where `genre` = 'punk')
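Combined with the user-defined sort the asker mentions, the ORDER BY goes after the last SELECT and applies to the combined result. A runnable sketch using SQLite via Python (schema reduced to two columns for brevity; sample band data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for t in ("us_music", "de_music"):
    conn.execute(f"CREATE TABLE {t} (band_name TEXT, genre TEXT)")
conn.executemany("INSERT INTO us_music VALUES (?, ?)",
                 [("Ramones", "punk"), ("Miles", "jazz")])
conn.executemany("INSERT INTO de_music VALUES (?, ?)",
                 [("Die Toten Hosen", "punk")])

# UNION ALL keeps duplicates and skips the dedup pass, so it is
# cheaper than UNION when rows cannot repeat across tables.
rows = conn.execute("""
    SELECT band_name, genre FROM us_music WHERE genre = 'punk'
    UNION ALL
    SELECT band_name, genre FROM de_music WHERE genre = 'punk'
    ORDER BY band_name
""").fetchall()
print(rows)  # → [('Die Toten Hosen', 'punk'), ('Ramones', 'punk')]
```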
It sounds like you'd be happier with a single table. The fact that the five have the same schema, and sometimes need to be presented as if they came from one table, points to putting it all in one table.
Add a new column to distinguish among the five languages (I'm assuming it's language that differs among the tables, since you said it was for localization). Don't worry about having 4.5 million records; any real database can handle that size without problems. Add the correct indexes and you'll have no trouble dealing with it as a single table.
Any of the above answers are valid, or an alternative is to qualify each column name with its table name - e.g.:
SELECT * from us_music, de_music where `us_music`.`genre` = 'punk' AND `de_music`.`genre` = 'punk'
The column is ambiguous because it appears in both tables, so you need to specify the where (or sort) field fully, such as us_music.genre or de_music.genre. However, you'd usually specify two tables only if you were going to join them in some fashion. The structure you're dealing with is occasionally referred to as a partitioned table, although it's usually done to separate the dataset into distinct files as well, rather than just to split the dataset arbitrarily. If you're in charge of the database structure and there's no good reason to partition the data, then I'd build one big table with an extra "origin" field containing a country code, but you're probably doing it for a legitimate performance reason.
Either use a UNION to combine the tables you're interested in (http://dev.mysql.com/doc/refman/5.0/en/union.html) or use the MERGE storage engine (http://dev.mysql.com/doc/refman/5.1/en/merge-storage-engine.html).
Your original attempt to span both tables creates an implicit JOIN. This is frowned upon by most experienced SQL programmers because it separates the tables being combined from the condition that specifies how to combine them.
The UNION is a good solution for the tables as they are, but there should be no reason they can't be put into the one table with decent indexing. I've seen adding the correct index to a large table increase query speed by three orders of magnitude.
A UNION over huge data sets can take a great deal of time. It can be better to perform the select in two steps:
select the IDs
then select from the main table using them