Query not reading the quoted string values stored in the table - sql

I have stored some quoted values in a separate table, and I am trying to filter the rows of another table by using those values in a subquery. However, the subquery does not seem to read the values, and the query returns an empty result.
The value is in the column override and resolves to 'HCC11','HCC12'.
When I copy the value from the column and paste it in place of the subquery, the query fetches the data correctly. I do not understand what the issue is. I have tried using the trim() function, but it still does not work.
Note: I have attached a picture for your reference:
select *
from table1
where column1 in (select override from table2)

Storing comma-separated values in a single column is a really poor database design to begin with, and enclosing them in quotes makes things even worse. The proper solution to your problem is a better design.
However, if you are forced to work with that bad design, you can convert them to a proper list of values using:
select *
from table1
where column1 in (select trim(both '''' from w.word)
from table2 t2
cross join unnest(string_to_array(t2.override, ',')) as w(word));
This assumes that table1.column1 only contains a single value without any quotes and that the override values never contain a comma in the real value (e.g. the above would break on a value like 'A,B', 'C')

You have the override column value as 'HCC11','HCC12', which cannot match the single value 'HCC11'. You could use the LIKE operator instead, as follows:
select * from table1 t1
where exists
(select 1 from table2 t2
where t2.override like concat('%''', t1.column1, '''%'));

According to your image, the value of table1.column1 would have to be 'HCC11','HCC12' (one string) to get a match from the subquery.
If table1 has 2 rows with the values HCC11 and HCC12, then you might use the exists keyword with your subquery.
Something like
select *
from table1 t1
where exists
(select 1
from table2 t2
where instr( t2.override, concat("'",t1.column1,"'") ) >=1
);
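To see why the plain IN returns nothing while the substring test works, here is a small sketch using Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not say which engine is in use, and the sample rows are made up). It relies only on instr() and the || concatenation operator, both of which SQLite provides.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the real engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 TEXT);
    CREATE TABLE table2 (override TEXT);
    -- Hypothetical sample rows.
    INSERT INTO table1 VALUES ('HCC11'), ('HCC12'), ('HCC13');
    -- One single string that contains the quotes and the comma.
    INSERT INTO table2 VALUES ('''HCC11'',''HCC12''');
""")

# The subquery returns ONE value, the whole string 'HCC11','HCC12',
# so no individual column1 value can ever be equal to it.
plain_in = conn.execute(
    "SELECT column1 FROM table1 "
    "WHERE column1 IN (SELECT override FROM table2)"
).fetchall()

# Wrap column1 in quotes and look for it inside the stored string instead.
exists_rows = conn.execute("""
    SELECT column1
    FROM table1 t1
    WHERE EXISTS (
        SELECT 1
        FROM table2 t2
        WHERE instr(t2.override, '''' || t1.column1 || '''') >= 1
    )
    ORDER BY column1
""").fetchall()

print(plain_in)
print(exists_rows)
```

Pasting the value by hand works because the SQL parser then sees two separate string literals, whereas the subquery hands back a single string that merely looks like a list.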

You can do this like -
1.
select * from table1
where column1 in
(select regexp_replace(unnest(string_to_array(override, ',')),'''', '', 'g') from table2)
Or
2.
select * from table1
where '''' || column1 || '''' in
(select unnest(string_to_array(override, ',')) from table2)
Although, I would just recommend not storing your data like this, since you want to query using it.

Related

How can I merge 2 partially overlapping strings using Apache Hive?

I have a field which holds a short list of ids of a fixed length.
e.g. aab:aac:ada:afg
The field is intended to hold at most 5 ids, growing gradually. I update it by adding from a similarly constructed field that may partially overlap with my existing set, e.g. ada:afg:fda:kfc.
The field expands when joined to an "update" table, as in the following example.
Here, id_list is the aforementioned list I want to "merge", and table_update is a table with new values I want to "merge" into table1.
insert overwrite table table1
select
id,
field1,
field2,
case
when (some condition) then a.id_list
else merge(a.id_list, b.id_list)
end as id_list
from table1 a
left join
table_update b
on a.id = b.id;
I'd like to produce a combined field with the following value:
aab:aac:ada:afg:fda.
The challenge is that I don't know whether or how much overlap the strings have until execution, and I cannot run any external code, or create UDFs.
Any suggestions how I could approach this?
Split to get arrays, explode them, select existing union all new, aggregate using collect_set, it will produce unique array, concatenate array into string using concat_ws(). Not tested:
select concat_ws(':',collect_set(id))
from
(
select explode(split('aab:aac:ada:afg',':')) as id --existing
union all
select explode(split('ada:afg:fda:kfc',':')) as id --new
);
You can use UNION instead of UNION ALL to get distinct values before aggregating into array. Or you can join new and existing and concatenate strings into one, then do the same:
select concat_ws(':',collect_set(id))
from
(
select explode(split(concat('aab:aac:ada:afg',':','ada:afg:fda:kfc'),':')) as id --existing+new
);
Most probably you will need to use lateral view with explode in the real query. See this answer about lateral view usage
Update:
insert overwrite table table1
select
id,
field1,
field2,
concat_ws(':',collect_set(a.idl)) as id_list -- kept last so the positional insert matches the column order used in the question
from
(
select
id,
field1,
field2,
split(
case
when (some condition) then a.id_list
when b.id_list is null then a.id_list
else concat(a.id_list,':',b.id_list)
end,':') as id_list_array
from table1 a
left join table_update b on a.id = b.id
)s
LATERAL VIEW OUTER explode(id_list_array ) a AS idl
group by
id,
field1,
field2
;
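Since the asker cannot run external code in the pipeline, the following is only meant as a reference model for checking what the Hive query should produce, not as a replacement for it. It is a minimal Python sketch of the intended merge; the function name merge_id_lists and the max_ids parameter are made up for illustration. Unlike collect_set, which makes no promise about element order and does not cap the list, this version keeps first-seen order and truncates to five ids.

```python
def merge_id_lists(existing, new, max_ids=5, sep=":"):
    """Merge two delimited id lists, dropping duplicates.

    Ids keep the order in which they are first seen (existing ids first),
    and the result is capped at max_ids entries.
    """
    ids = [part for part in (existing or "").split(sep) if part]
    ids += [part for part in (new or "").split(sep) if part]
    # dict.fromkeys removes duplicates while preserving insertion order.
    unique_in_order = list(dict.fromkeys(ids))
    return sep.join(unique_in_order[:max_ids])


merged = merge_id_lists("aab:aac:ada:afg", "ada:afg:fda:kfc")
unchanged = merge_id_lists("aab:aac:ada:afg", None)  # no row in table_update
print(merged)
print(unchanged)
```

If the order or the five-id cap matters in the real table, the Hive result is worth comparing against a model like this on a few sample rows.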

A regular expression to check if a column contains another column value

I have a specific scenario, described below:
Table T1 contains Name and status. Table T2 contains the column Name_status. status has the values pass and fail. Name_status should have values like <Name>_<status>ed.
Can we come up with a regular expression that would fetch all the values in <Name>_<status>ed format?
Any help would be appreciated. Thanks
You may try this
SELECT *
FROM T1, T2
WHERE REGEXP_REPLACE(T2.Name_Status, '^(.+?)_(.+?)ed$', '\1') = T1.name
AND REGEXP_REPLACE(T2.Name_Status, '^(.+?)_(.+?)ed$', '\2') = T1.status
This would work out fine
SELECT name_status
FROM(SELECT * FROM tbl1, tbl2 WHERE tbl2.name_status
REGEXP concat(tbl1.name,"_",tbl1.status)) as temp
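If the format really is always <Name>_<status>ed, a regular expression may not be needed at all: the expected string can be built from T1 and compared for equality. Below is a sketch of that alternative (plain concatenation, not a regex) using Python's sqlite3 module as a stand-in database; the sample names are invented. Note that an unanchored REGEXP test like the one above also matches when the pattern appears anywhere inside a longer string, whereas the equality test does not.

```python
import sqlite3

# In-memory SQLite database as a stand-in; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Name TEXT, status TEXT);
    CREATE TABLE T2 (Name_status TEXT);
    INSERT INTO T1 VALUES ('alice', 'pass'), ('bob', 'fail');
    INSERT INTO T2 VALUES
        ('alice_passed'), ('bob_failed'), ('bob_passed'), ('carol_failed');
""")

# Build <Name>_<status>ed from T1 and join on exact equality.
matches = conn.execute("""
    SELECT T2.Name_status
    FROM T1
    JOIN T2 ON T2.Name_status = T1.Name || '_' || T1.status || 'ed'
    ORDER BY T2.Name_status
""").fetchall()

print(matches)
```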

Displaying a delimited string for values that are not in the GROUP BY clause

When I have an SQL query with a GROUP BY clause, it is often very useful to see some of the un-grouped values for easier debugging.
My question is: how can I select a string composed of the un-grouped values?
For example, in the following code:
SELECT t2.ID
--,t1.Id -- < how can I display this as a comma separated string
FROM t1
INNER JOIN t2 on t1.t2ID = t2.ID
GROUP BY t2.ID
I would like to have a way to select a string with t1.Id's for each grouped record (e.g. "42, 13, 18"...).
How can I achieve that?
Assuming these are integer values, you can use a naked XML PATH transform to handle group concatenation for you (and this even supports predictable and well-defined order, unlike all other group concatenation methods - which have undefined behavior).
-- Table variables are declared with @ (a # prefix denotes a temp table and cannot be used with DECLARE).
DECLARE @t2 TABLE(ID INT);
DECLARE @t1 TABLE(ID INT IDENTITY(1,1),t2ID INT);
INSERT @t2(ID) VALUES(1),(2),(3);
INSERT @t1(t2ID) VALUES(1),(1),(1),(2);
SELECT t2.ID, t2IDs = STUFF((
SELECT ',' + CONVERT(VARCHAR(11), t1.ID)
FROM @t1 AS t1 WHERE t1.t2ID = t2.ID
ORDER BY t1.ID
FOR XML PATH('')),1,1,'')
FROM @t2 AS t2;
Results:
ID t2IDs
---- -----
1 1,2,3
2 4
3 NULL
Note that you don't need ID in the GROUP BY clause, because you're no longer needing to filter out duplicates matched by virtue of the JOIN. Of course this assumes your column is named appropriately - if that column has duplicates with no JOIN involved at all, then it has a terribly misleading name. A column named ID should uniquely identify a row (but even better would be to call it what it is, and name it the same throughout the model, e.g. CustomerID, OrderID, PatientID, etc).
If you're dealing with strings, you need to account for cases where the string may contain XML-unsafe characters (e.g. <). In those cases, this is the method I've always used:
FOR XML PATH(''), TYPE).value(N'./text()[1]',N'nvarchar(max)'),1,1,'')

I am trying to return a certain values in each row which depend on whether different values in that row are already in a different table

I'm still a n00b at SQL and am running into a snag. What I have is an initial selection of certain IDs into a temp table based upon certain conditions:
SELECT DISTINCT ID
INTO #TEMPTABLE
FROM ICC
WHERE ICC_Code = 1 AND ICC_State = 'CA'
Later in the query I SELECT a different and much longer listing of IDs along with other data from other tables. That SELECT is about 20 columns wide and is my result set. What I would like to be able to do is add an extra column to that result set with each value of that column either TRUE or FALSE. If the ID in the row is in #TEMPTABLE the value of the additional column should read TRUE. If not, FALSE. This way the added column will read TRUE or FALSE on each row, depending on whether the ID in that row is in #TEMPTABLE.
The second SELECT would be something like:
SELECT ID,
ColumnA,
ColumnB,
...
NEWCOLUMN
FROM ...
NEWCOLUMN's value for each row would depend on whether the ID in that row returned is in #TEMPTABLE.
Does anyone have any advice here?
Thank you,
Matt
If you left join to the #TEMPTABLE you'll get a NULL where the IDs don't exist
SELECT X.ID,
ColumnA,
ColumnB,
...
CASE WHEN T.ID IS NOT NULL THEN 1 ELSE 0 END AS NEWCOLUMN -- 1 if the ID is in #TEMPTABLE, otherwise 0
FROM ... X
LEFT JOIN #TEMPTABLE T
ON T.ID = X.ID -- define how the two rows can be related uniquely
You need to LEFT JOIN your results query to #TEMPTABLE ON ID; this gives you the ID if there is one and NULL if there isn't. If you want 1 or 0, this would do it (for SQL Server): CASE WHEN #TEMPTABLE.ID IS NULL THEN 0 ELSE 1 END.
A few notes on coding for performance:
By definition an ID column is unique, so the DISTINCT is redundant and causes unnecessary processing (unless it is an ID from another table)
Why would you store this to a temporary table rather than just using it in the query directly?
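For anyone who wants to try the LEFT JOIN idea end to end, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server (an assumption, as are the results table and the sample rows). SQLite has no # temp-table syntax, so the temp table is created with CREATE TEMP TABLE and named temptable; the CASE expression itself is standard SQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ICC (ID INTEGER, ICC_Code INTEGER, ICC_State TEXT);
    CREATE TABLE results (ID INTEGER, ColumnA TEXT);
    INSERT INTO ICC VALUES (1, 1, 'CA'), (2, 1, 'NY'), (3, 2, 'CA');
    INSERT INTO results VALUES (1, 'a'), (2, 'b'), (3, 'c');

    -- Equivalent of SELECT DISTINCT ID INTO #TEMPTABLE ...
    CREATE TEMP TABLE temptable AS
        SELECT DISTINCT ID FROM ICC
        WHERE ICC_Code = 1 AND ICC_State = 'CA';
""")

# Rows without a match in temptable get NULL for t.ID, which CASE turns
# into 'FALSE'; matched rows become 'TRUE'.
rows = conn.execute("""
    SELECT x.ID,
           x.ColumnA,
           CASE WHEN t.ID IS NOT NULL THEN 'TRUE' ELSE 'FALSE' END AS NEWCOLUMN
    FROM results x
    LEFT JOIN temptable t ON t.ID = x.ID
    ORDER BY x.ID
""").fetchall()

for row in rows:
    print(row)
```

Because the temp table holds distinct IDs, the LEFT JOIN cannot multiply rows in the result set.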
You could use a union and a subquery.
Select . . . . , 'TRUE'
From . . .
Where ID in
(Select id FROM #temptable)
UNION
SELECT . . . , 'FALSE'
FROM . . .
WHERE ID NOT in
(Select id FROM #temptable)
So the top part, SELECT ... FROM ... WHERE ID IN (Subquery), does a SELECT if the ID is in your temptable.
The bottom part does a SELECT if the ID is not in the temptable.
The UNION operator joins the two results nicely, since both SELECT statements will return the same number of columns.
To expand on what someone else was saying with Union, just do something like so
SELECT id, TRUE AS myColumn FROM `table1`
UNION
SELECT id, FALSE AS myColumn FROM `table2`

SQL: pull distinct values from 1 column with all values from 2nd column

It's easier to explain what I need to do with an example.
The table looks like this:
Col 1, Col 2
1, a
1, b
2, a
2, b
2, c
I need a query to return something like
1,a,b
2,a,b,c
You would want a line such as:
UPDATE t
SET t.dupcustodians = dt.custadmin
FROM tbldoc t
INNER JOIN (SELECT t1._dupid,
(SELECT DISTINCT custadmin + ', '
FROM tbldoc t2
WHERE t2._dupid = t1._dupid
ORDER BY custadmin + ', '
FOR XML PATH('')) AS custadmin
FROM tbldoc t1
GROUP BY _dupid) AS dt
ON t._dupid = dt._dupid
;
I had a similar problem where everything had a name in the "CustAdmin" field and then they all had potentially duplicate _DupID values. I wanted it to list out in a new field "DupCustodians" all the names that were there when the _DupID values were alike from one record to the next. So swap those names with the field names you need (and don't forget to change the table names, of course) and you should be good.
Well, if you are using MySQL, then you can do this:
SELECT Col1, GROUP_CONCAT(Col2)
FROM MyTable
GROUP BY Col1
Other databases that don't have the MySQL specific GROUP_CONCAT function might require a more complex query.
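SQLite happens to ship a group_concat aggregate as well, so the idea can be tried quickly with Python's built-in sqlite3 module. This is only a sketch on the sample data from the question; the table name MyTable is taken from the answer above. The order of the values inside each concatenated string is not something this query controls, so the snippet sorts them in Python before comparing.

```python
import sqlite3

# In-memory SQLite database loaded with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Col1 INTEGER, Col2 TEXT);
    INSERT INTO MyTable VALUES
        (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (2, 'c');
""")

rows = conn.execute("""
    SELECT Col1, GROUP_CONCAT(Col2)
    FROM MyTable
    GROUP BY Col1
    ORDER BY Col1
""").fetchall()

# Split each comma-joined string and sort it, since the order within a
# group is left to the database.
grouped = {col1: sorted(joined.split(",")) for col1, joined in rows}
print(grouped)
```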