Let's say I have a database A with tables B1 and B2.
B1 has columns C1 and C2
and B2 has columns D1, D2 and D3.
I am looking for an Impala query that yields the following desired output:
B1 | "C1+C2"
B2 | "D1+D2+D3"
where "D1+D2+D3" and "C1+C2" are concatenated strings.
Do you want the concatenated columns in a new table, or do you want to add them to your existing tables? Either way, you can use the query below in Impala to concatenate the columns:
SELECT
CONCAT(C1,C2) AS concat_fields
, "B1" AS table_name
FROM B1
UNION
SELECT
CONCAT(D1,D2,D3) AS concat_fields
, "B2" AS table_name
FROM B2
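For anyone who wants to try the shape of this query without an Impala cluster, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the sample rows are made up, SQLite's || operator stands in for Impala's CONCAT, and UNION ALL is used so that identical rows are not collapsed (UNION would deduplicate them).

```python
import sqlite3

# In-memory database standing in for database A (sample rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B1 (C1 TEXT, C2 TEXT);
    CREATE TABLE B2 (D1 TEXT, D2 TEXT, D3 TEXT);
    INSERT INTO B1 VALUES ('a', 'b');
    INSERT INTO B2 VALUES ('x', 'y', 'z');
""")

# '||' is SQLite's concatenation operator; in Impala this would be CONCAT(...).
concat_rows = conn.execute("""
    SELECT 'B1' AS table_name, C1 || C2 AS concat_fields FROM B1
    UNION ALL
    SELECT 'B2' AS table_name, D1 || D2 || D3 AS concat_fields FROM B2
    ORDER BY table_name
""").fetchall()

print(concat_rows)
```

The table name comes first here to match the desired output in the question.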
I need to compare each value in a column against all the other values in the same column, searching for a pattern taken from each value among all the others.
For example, in this table, I need to search for the xxx, yyy, zzz patterns from the first Order ID value in all the other Order IDs and, where a pattern is found, create a new column that replaces the actual value with the pattern.
Suppliers  Order ID
A1         xxx
A1         00xxx
A1         xxx0
A1         200xx
A2         yyy
A2         01yyy0
A2         45yyy
A3         45zzz
Obtaining:
Suppliers  Order ID  Order ID OK
A1         xxx       xxx
A1         00xxx     xxx
A1         xxx0      xxx
A1         200xx     200xx
A2         yyy       yyy
A2         01yyy0    yyy
A2         45yyy     yyy
A3         45zzz     zzz
I have a problem with the runtime here: as the table grows, the search space grows with it, and I end up with a steep (roughly quadratic) increase in runtime. So I need to run the search logic for each supplier separately (this is where the other column, Suppliers, needs to play a role), that is, search all the values for supplier A1 and replace the pattern, then do the same for supplier A2, and so on.
In pandas, that can be achieved using split-apply-combine techniques like df.groupby.transform, but I am totally lost on how to do that using PostgreSQL or Vertica SQL.
Here is what I have tried so far, using a self join on the Order ID column to create the search space for each value in all the others, but the runtime is not feasible.
WITH cte AS (
SELECT
A."Suppliers" AS "Suppliers",
A."Order ID" AS "Order ID",
B."Order ID" AS "Order ID OK",
ROW_NUMBER() OVER (PARTITION BY A."Order ID") AS "DUPLICATE_ORDER_ID"
FROM table AS A
LEFT JOIN table AS B
ON A."Order ID" LIKE '%' || B."Order ID" || '%'
)
SELECT "Suppliers", "Order ID", "Order ID OK" FROM cte
WHERE "DUPLICATE_ORDER_ID" = 1
Is there any way I could modify this same code so that it applies the search logic per supplier only and then concatenates the results? (The Order ID values are unique no matter the supplier.)
Thank you!!!!!!
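One way to restrict the search to each supplier is to add the supplier to the join condition, so each Order ID is only compared with Order IDs of the same supplier. The sketch below is not a tested PostgreSQL or Vertica solution; it runs the idea on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), with a hypothetical table name orders. It also assumes the wanted pattern is the shortest matching Order ID within the supplier, which is my reading of the example, not something the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE orders ("Suppliers" TEXT, "Order ID" TEXT)')
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A1", "xxx"), ("A1", "00xxx"), ("A1", "xxx0"), ("A1", "200xx"),
     ("A2", "yyy"), ("A2", "01yyy0"), ("A2", "45yyy"), ("A3", "45zzz")],
)

per_supplier_sql = """
WITH cte AS (
    SELECT
        A."Suppliers" AS "Suppliers",
        A."Order ID"  AS "Order ID",
        B."Order ID"  AS "Order ID OK",
        -- Assumption: the shortest match within the supplier is the pattern.
        ROW_NUMBER() OVER (
            PARTITION BY A."Order ID"
            ORDER BY LENGTH(B."Order ID"), B."Order ID"
        ) AS rn
    FROM orders AS A
    JOIN orders AS B
      ON A."Suppliers" = B."Suppliers"   -- search only inside one supplier
     AND A."Order ID" LIKE '%' || B."Order ID" || '%'
)
SELECT "Suppliers", "Order ID", "Order ID OK"
FROM cte
WHERE rn = 1
ORDER BY "Suppliers", "Order ID"
"""

order_id_ok = {order_id: ok for _, order_id, ok in conn.execute(per_supplier_sql)}
print(order_id_ok)
```

Because every row matches itself, an inner join is enough here. Note that 45zzz stays 45zzz in this sketch: the sample data has no separate zzz row for supplier A3, so a self join cannot produce it. Also, any % or _ characters inside an Order ID would act as LIKE wildcards.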
Is there a way to refer to columns created in the same SELECT?
SELECT 2*a as a2,
3*a2 as a6
FROM ...
I am aware that I could use a nested query, but with many variables created one after another this seems tedious.
You cannot refer to an alias in the same SELECT clause. This is standard SQL behavior, and it prevents ambiguities.
There are two options, as I also mentioned in the comment.
-- Use expression of the a2 in a6
SELECT 2*a as a2,
3*(2*a) as a6
FROM ...
-- OR use a sub-query (many databases require an alias on the derived table)
SELECT a2, 3*a2 as a6 FROM
(SELECT 2*a as a2
--3*a2 as a6
FROM ...) AS t
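When many derived columns build on each other, chaining common table expressions can read better than deeply nested sub-queries. This is a minimal sketch on SQLite via Python's sqlite3 module; the table t, its sample values and the CTE names step1 and step2 are hypothetical, and whether your database supports WITH is something to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# Each CTE may use the aliases defined by the previous one.
chained_rows = conn.execute("""
    WITH step1 AS (SELECT a, 2*a AS a2 FROM t),
         step2 AS (SELECT a2, 3*a2 AS a6 FROM step1)
    SELECT a2, a6 FROM step2 ORDER BY a2
""").fetchall()

print(chained_rows)
```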
I need to compare an attribute value of row 1 with row 2, row 2 with row 3, row 3 with row 4, and so on for the entire table. The data set is small after applying a condition in the WHERE clause.
The comparison part of it involves joins to other tables (which I can manage).
ID PROJECT VALUE
B1 PRJ001 100
B2 PRJ001 200
B3 PRJ001 200
B4 PRJ001 300
.....
.....
B9 PRJ001 600
In the example above, I need to compare B1 to B2, B2 to B3 ... B8 to B9 and count the number of times the VALUES don't match.
Any help with this would be much appreciated.
Thank you.
You need the LEAD function:
Select t.*,
LEAD(value,1) over (order by id) as next_value
From your_table t;
Also, just to be clear, I hope this is only an example: if you add more IDs such as 'B10', ORDER BY would sort B10 between B1 and B2.
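To get from LEAD to the actual count of mismatches, you can wrap the query and count the rows where the value differs from the next one. A rough sketch on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later); only the first four rows of the example are used, and the numeric ordering assumes every ID is one letter followed by a number, which is my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, project TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("B1", "PRJ001", 100), ("B2", "PRJ001", 200),
     ("B3", "PRJ001", 200), ("B4", "PRJ001", 300)],
)

# Order by the numeric part of the ID so that B10 would sort after B9.
mismatch_count = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT value,
               LEAD(value, 1) OVER (
                   ORDER BY CAST(SUBSTR(id, 2) AS INTEGER)
               ) AS next_value
        FROM your_table
    ) AS pairs
    WHERE next_value IS NOT NULL   -- the last row has nothing to compare with
      AND value <> next_value
""").fetchone()[0]

print(mismatch_count)
```

For these four rows the pairs B1/B2 and B3/B4 differ, so the count is 2.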
Table Code:
Col1
----
A1
A2
A3
B1
B2
C1
D1
D2
(I have other columns as well)
I am trying to create every possible combination, EXCLUDING a value paired with itself (i.e. COL1: A1, COL2: A1) and EXCLUDING the same pair the reverse way (i.e. A1 A2 and A2 A1). The two values are to be in separate columns, and there are other columns included as well. I am a newbie, go easy on me :)
So far I have:
SELECT
a.Col1, a.[differentcolumn],
b.Col1, b.[differentcolumn]
FROM
[dbo].code a
CROSS JOIN
[dbo].code b
WHERE
a.[col1] != b.[col1]
This is almost it but it gives me:
A1 A2
A2 A1
I only want it one way (The first one). How do I do this?
I'm not completely clear on your requirement, but do you just need this?
SELECT
a.Col1, a.[differentcolumn],
b.Col1, b.[differentcolumn]
FROM
[dbo].code a
INNER JOIN [dbo].code b ON a.[col1] < b.[col1]
This will join the table to itself on col1, but using < means that you won't see the values where the left-hand copy has a col1 greater than or equal to the right-hand copy, which seems to be what you want.
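To see the effect of the < condition on a small case, here is a sketch using SQLite through Python's sqlite3 module. The table is reduced to the single column Col1 with three sample values, so it only illustrates the join condition, not the full SQL Server schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE code (Col1 TEXT)")
conn.executemany("INSERT INTO code VALUES (?)", [("A1",), ("A2",), ("A3",)])

# a.Col1 < b.Col1 drops self-pairs and keeps each unordered pair once.
unique_pairs = conn.execute("""
    SELECT a.Col1, b.Col1
    FROM code AS a
    INNER JOIN code AS b ON a.Col1 < b.Col1
    ORDER BY a.Col1, b.Col1
""").fetchall()

print(unique_pairs)
```

Three values give three pairs: A1 A2, A1 A3 and A2 A3.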
How can I split comma-separated values stored in a SQL database column into individual values?
For example, the column is stored with comma-separated values as shown below:
EligibleGroup
A11,A12,A13
B11,B12,B13
I need to get
EligibleGroup
A11
A12
A13
B11
B12
...
I have written a query that fetches a list of employees with employee name and eligible group:
XXX A11
YYY B11
ZZZ C11
I need to check whether each employee's (XXX, YYY, ZZZ) eligible group falls within this:
EligibleGroup
A11,A12,A13
B11,B12,B13
and return only those rows.
Use a user-defined function like the one shown here (including source code). It returns the split values as a table (one row per value) that you can select from, like:
select txt_value from dbo.fn_ParseText2Table('A11,A12,A13')
returns
A11
A12
A13
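If a user-defined function is not an option, the same one-row-per-value result can be sketched with a recursive common table expression. The example below is not the linked function and not SQL Server syntax; it runs on SQLite through Python's sqlite3 module, uses SQLite's substr and instr, and reuses the table name tblEligibleGroup only as a hypothetical stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEligibleGroup (EligibleGroup TEXT)")
conn.executemany(
    "INSERT INTO tblEligibleGroup VALUES (?)",
    [("A11,A12,A13",), ("B11,B12,B13",)],
)

# Peel one item off the front of 'rest' per recursion step.
split_sql = """
WITH RECURSIVE split(item, rest) AS (
    SELECT '', EligibleGroup || ',' FROM tblEligibleGroup
    UNION ALL
    SELECT substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT item FROM split WHERE item <> '' ORDER BY item
"""

split_values = [row[0] for row in conn.execute(split_sql)]
print(split_values)
```

Unlike a fixed-position substring, this does not depend on every value having the same length.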
You could use a subquery:
SELECT employee_name, eligible_group
FROM YourTable
WHERE eligible_group IN
(SELECT SPLIT(EligibleGroup)
FROM tblEligibleGroup
WHERE <some conditions here>)
I don't believe the "SPLIT" function exists in SQL Server so you'll have to either create a user defined function to handle that, or you could use the nifty workaround suggested here: How do I split a string so I can access item x?
I think you can do it this way,
select left('A11,A12,A13',3) + SUBSTRING('A11,A12,A13',charindex(',','A11,A12,A13'),10)
I think you may not have to split EligibleGroup at all. You can do it another way:
select empId
from yourTempEmpTable t1, EligibleGroup t2
-- wrap both sides in commas so that e.g. 'A1' does not match 'A11'
where ',' + t2.elibigle + ',' like '%,' + t1.elibigle + ',%'
I think it should work.
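Here is a small sketch of that matching idea applied to the example from the question, run on SQLite through Python's sqlite3 module. The table and column names (emp, name, grp, groups) are hypothetical, and SQLite's || replaces SQL Server's + for concatenation. Both sides are wrapped in commas so that only whole list entries match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, grp TEXT);
    CREATE TABLE groups (EligibleGroup TEXT);
    INSERT INTO emp VALUES ('XXX', 'A11'), ('YYY', 'B11'), ('ZZZ', 'C11');
    INSERT INTO groups VALUES ('A11,A12,A13'), ('B11,B12,B13');
""")

# Keep only employees whose group appears as a whole entry in some list.
eligible_employees = conn.execute("""
    SELECT e.name, e.grp
    FROM emp AS e
    JOIN groups AS g
      ON ',' || g.EligibleGroup || ',' LIKE '%,' || e.grp || ',%'
    ORDER BY e.name
""").fetchall()

print(eligible_employees)
```

ZZZ is dropped because C11 is in neither list.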
Assuming that EligibleGroup holds fixed-length values, you can try using SUBSTRING as follows:
select substring(EligibleGroup,1,3) from #test union all
select substring(EligibleGroup,5,3) from #test union all
select substring(EligibleGroup,9,3) from #test
This will return:
A11
A12
A13
B11
B12
...
You can try it in Data Explorer
And if you need to check which EligibleGroup an employee falls into, try this:
Select EligibleGroup from test where eligibleGroup like '%A11%'