I'm using pyodbc and postgres.
Can I alias multiple columns?
Here's the description of my problem:
Data structure:
data
id | c1 | c2
-------------
1 | 11 | 12
2 | 21 | 22
Notation: c is for column
dictionary
id | key | value
----------------
1 | k1 | v11
1 | k2 | v12
2 | k1 | v21
2 | k2 | v22
Notation: k is for key, v is for value
You can think of k1 and k2 as two more columns. The data structure is this way because it's constantly changing. I didn't design it, I just have to go with it.
I can't figure out an sql query to give me something like the following (most importantly, for some row, I can access k1 and k2 columns by some name):
data
id | c1 | c2 | k1 | k2
-------------------------
1 | 11 | 12 | v11 | v12
2 | 21 | 22 | v21 | v22
The problem I keep running into is that if I alias the tables, the SQL result contains two "key" columns from the dictionary table, so I can't control which of the two I access; but if I alias the columns, I can't control which tables are being referenced inside the SQL statement.
The fix I'm thinking is to alias two columns:
SELECT * FROM data
FULL JOIN dictionary AS a1,a2,a3
ON data.id = a1
FULL JOIN dictionary AS a4,a5,a6
ON data.id = a4
WHERE a2 = k1 and a5 = k2
Notation: a is for alias
The result of this would theoretically look like
data
id | c1 | c2 | a3 | a6
-------------------------
1 | 11 | 12 | v11 | v12
2 | 21 | 22 | v21 | v22
Note all a's would technically be here, but 3 and 6 are the ones I'm interested in
You can alias the entire table, for example dictionary AS d1, and then refer to that table's columns as d1.col1. Once a table has an alias, use the alias (d.id, not data.id) everywhere in the query. For example:
SELECT d.id
     , d.c1
     , d.c2
     , d1.value AS a3
     , d2.value AS a6
FROM data AS d
LEFT JOIN
    dictionary AS d1
    ON d.id = d1.id
    AND d1.key = 'k1'
LEFT JOIN
    dictionary AS d2
    ON d.id = d2.id
    AND d2.key = 'k2'
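The self-join above can be tried end to end with a small sketch. It assumes SQLite through Python's built-in sqlite3 module instead of pyodbc and PostgreSQL, purely so it runs without a server; the join itself is plain SQL. The output aliases k1 and k2 are chosen here for illustration, so each pivoted value can be read by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by alias

# Same sample data as in the question.
conn.executescript("""
CREATE TABLE data (id INTEGER PRIMARY KEY, c1 INTEGER, c2 INTEGER);
CREATE TABLE dictionary (id INTEGER, "key" TEXT, value TEXT);
INSERT INTO data VALUES (1, 11, 12), (2, 21, 22);
INSERT INTO dictionary VALUES
    (1, 'k1', 'v11'), (1, 'k2', 'v12'),
    (2, 'k1', 'v21'), (2, 'k2', 'v22');
""")

# Join the dictionary table once per key, each time under its own alias.
rows = conn.execute("""
    SELECT d.id, d.c1, d.c2, d1.value AS k1, d2.value AS k2
    FROM data AS d
    LEFT JOIN dictionary AS d1 ON d.id = d1.id AND d1."key" = 'k1'
    LEFT JOIN dictionary AS d2 ON d.id = d2.id AND d2."key" = 'k2'
    ORDER BY d.id
""").fetchall()

for row in rows:
    print(row["id"], row["c1"], row["c2"], row["k1"], row["k2"])
```

Because the key filter sits in the ON clause rather than in WHERE, a data row with no matching key would still be returned, with NULL in that column.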
I have 3 tables A, B, C.
A is the main table
B & C have a many-to-one relation with A
Let's say the data is as follow:
Table A
| id | name |
| :---: | :---: |
| a0 | A0 |
| a1 | A1 |
Table B
| id | a_id (FK) | value |
| :---: | :-------: | :---: |
| b0 | a0 | B0 |
| b1 | a0 | B1 |
| b2 | a1 | B2 |
Table C
| id | a_id (FK) | value |
| :---: | :-------: | :---: |
| c0 | a0 | C0 |
| c1 | a0 | C1 |
I want to retrieve the records in A with their relations aggregated into an array or a json, like so:
| id | name | b | c |
| :---: | :---: | :---: | :---: |
| a0 | A0 | [[b0, B0], [b1, B1]] | [[c0, C0], [c1, C1]] |
| a1 | A1 | [[b2, B2]] | [] |
If I use the classic TypeORM .find():
const items = await aRepository.find({ relations: ["B", "C"] });
I will quickly run out of memory because TypeORM is simply making a query with LEFT JOINs which will return duplicated lines before aggregating them. There's an issue about it here.
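The row multiplication described above can be seen in a small sketch. It assumes the generated query joins both relations in one statement, as the question describes, and uses SQLite through Python's sqlite3 module with hypothetical volumes (one A row with 10 B rows and 10 C rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, value TEXT);
CREATE TABLE C (id INTEGER PRIMARY KEY, a_id INTEGER, value TEXT);
INSERT INTO A VALUES (1, 'A0');
""")

# Hypothetical volume: 10 children in each table for the single A row.
conn.executemany("INSERT INTO B (a_id, value) VALUES (1, ?)",
                 [(f"B{i}",) for i in range(10)])
conn.executemany("INSERT INTO C (a_id, value) VALUES (1, ?)",
                 [(f"C{i}",) for i in range(10)])

# Joining both child tables at once pairs every B row with every C row.
joined_rows = conn.execute("""
    SELECT A.id, B.id, C.id
    FROM A
    LEFT JOIN B ON B.a_id = A.id
    LEFT JOIN C ON C.a_id = A.id
""").fetchall()

# Number of child rows that actually exist.
child_rows = conn.execute(
    "SELECT (SELECT COUNT(*) FROM B) + (SELECT COUNT(*) FROM C)"
).fetchone()[0]

print(len(joined_rows))  # 100 joined rows
print(child_rows)        # 20 real child rows
```

The joined result grows with the product of the two child counts per parent, while the underlying data grows only with their sum.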
This leaves me writing a custom SQL statement with nested SELECTs:
SELECT
A.id, A.name,
(
SELECT json_agg(bs)
FROM (
SELECT id, "value"
FROM B WHERE B.a_id = A.id
) bs
) AS b,
(
SELECT json_agg(cs)
FROM (
SELECT id, "value"
FROM C WHERE C.a_id = A.id
) cs
) AS c
FROM A
Without an index on B.a_id and C.a_id, it takes ~230ms with 200 records in A and about 10 records in B and C for each A record.
By creating an index on B.a_id and C.a_id, I can bring this down to 30ms.
Is there still a better way to do this?
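One alternative that is sometimes suggested is to aggregate each child table once, grouped by a_id, and LEFT JOIN the aggregated results to A. The sketch below shows the shape of that query under a few assumptions: it runs on SQLite through Python's sqlite3 module, and it uses group_concat as a stand-in for json_agg (in PostgreSQL each derived table would call json_agg with GROUP BY a_id instead). Whether it beats the correlated subqueries depends on the planner and the data, so it is worth comparing both with EXPLAIN ANALYZE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE B (id TEXT PRIMARY KEY, a_id TEXT, value TEXT);
CREATE TABLE C (id TEXT PRIMARY KEY, a_id TEXT, value TEXT);
INSERT INTO A VALUES ('a0', 'A0'), ('a1', 'A1');
INSERT INTO B VALUES ('b0', 'a0', 'B0'), ('b1', 'a0', 'B1'), ('b2', 'a1', 'B2');
INSERT INTO C VALUES ('c0', 'a0', 'C0'), ('c1', 'a0', 'C1');
""")

# Each child table is aggregated once per a_id, then joined to A,
# so B rows are never paired with C rows.
rows = conn.execute("""
    SELECT A.id, A.name, b_agg.vals, c_agg.vals
    FROM A
    LEFT JOIN (SELECT a_id, group_concat(id || ':' || value) AS vals
               FROM B GROUP BY a_id) AS b_agg ON b_agg.a_id = A.id
    LEFT JOIN (SELECT a_id, group_concat(id || ':' || value) AS vals
               FROM C GROUP BY a_id) AS c_agg ON c_agg.a_id = A.id
    ORDER BY A.id
""").fetchall()


def to_pairs(vals):
    """Turn 'b0:B0,b1:B1' into [['b0', 'B0'], ['b1', 'B1']]; NULL becomes []."""
    return sorted(item.split(":") for item in vals.split(",")) if vals else []


result = [(a_id, name, to_pairs(b), to_pairs(c)) for a_id, name, b, c in rows]
for record in result:
    print(record)
```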
I am trying to do fuzzy string matching to get as many matches as I can. First I create the Levenshtein distance function (http://www.kodyaz.com/articles/fuzzy-string-matching-using-levenshtein-distance-sql-server.aspx) and store it as "distance" in my dbo schema.
My first table (t1) looks like this:
Name | Synonym
A | A1
A | A2
A | A3
B | B1
B | B2
My second table (t2) looks like this:
The ID field may closely resemble Name or Synonym.
ID | Description
A | XXX
B | YYY
My goal is to left join on either the Name or its Synonyms whenever the distance between the two strings from each table (t1 and t2) is smaller than 2.
Here is my current work:
SELECT *
FROM (
SELECT t2.ID, ROW_NUMBER() over (partition by ID order by ID) as rn
FROM table1 as t1
LEFT JOIN table2 as t2
ON (upper(trim(t1.Name)) = upper(trim(t2.ID)) OR upper(trim(t1.Synonym)) = upper(trim(t2.ID)))
WHERE (dbo.distance(t1.Name,t2.ID)<=2 OR dbo.distance(t1.Synonym,t2.ID)<=2)
) temp
WHERE rn=1
Ideally, as long as the distance is within that threshold, the join should still happen.
Adding that condition should produce more matches, but it doesn't.
Am I missing anything here?
I was wondering if my problem comes from this:
My intention is to check whether the conditions are met and, if so, do the join.
But my code probably tells SQL to join first and then filter afterwards.
Is there a way to have it check the condition first and then do the join?
I have used the DIFFERENCE function just for demo purposes. It compares how similar two strings sound and returns a value from 4 (most similar) down to 0 (least similar). You can try similar logic using your distance function.
DECLARE @table1 TABLE(Name varchar(10), synon varchar(10))
DECLARE @table2 TABLE(Name varchar(10), synon varchar(10))
INSERT INTO @table1
VALUES ('A','A1'),('A','A2'),('A','A3'),('B','B1'),('B','B2'),('B','B3')
INSERT INTO @table2
VALUES ('A','A1'),('A','A2'),('C','C1'),('B','B2'),('B','B3')
SELECT t1.name, t1.synon, t2.Name, t2.synon
FROM @table1 as t1
CROSS APPLY (
    SELECT t2.Name, t2.synon
    FROM @table2 as t2
    WHERE DIFFERENCE(t2.Name, t1.Name) = 4
       OR DIFFERENCE(t2.synon, t1.synon) = 4
) as t2
+------+-------+------+-------+
| name | synon | Name | synon |
+------+-------+------+-------+
| A | A1 | A | A1 |
| A | A2 | A | A1 |
| A | A3 | A | A1 |
| A | A1 | A | A2 |
| A | A2 | A | A2 |
| A | A3 | A | A2 |
| B | B1 | B | B2 |
| B | B2 | B | B2 |
| B | B3 | B | B2 |
| B | B1 | B | B3 |
| B | B2 | B | B3 |
| B | B3 | B | B3 |
+------+-------+------+-------+
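The same idea can be sketched with an edit-distance function placed directly in the join condition. In the question's query the ON clause still requires exact equality, and WHERE can only remove rows afterwards, so the distance test never gets a chance to add matches. The sketch below is a rough illustration under stated assumptions: it uses SQLite through Python's sqlite3 module, registers a plain Python Levenshtein function under the name distance as a stand-in for dbo.distance, and uses hypothetical names (Apple, Aple, and so on) instead of the single-letter sample values.

```python
import sqlite3


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


conn = sqlite3.connect(":memory:")
conn.create_function("distance", 2, levenshtein)  # stand-in for dbo.distance

# Hypothetical sample data, not taken from the question.
conn.executescript("""
CREATE TABLE table1 (Name TEXT, Synonym TEXT);
CREATE TABLE table2 (ID TEXT, Description TEXT);
INSERT INTO table1 VALUES ('Apple', 'Appel'), ('Banana', 'Bananna'), ('Durian', 'Duryan');
INSERT INTO table2 VALUES ('Aple', 'XXX'), ('Bananas', 'YYY');
""")

# Exact comparison in ON: nothing matches, every t2 column is NULL.
exact = conn.execute("""
    SELECT t1.Name, t2.ID
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
      ON upper(trim(t1.Name)) = upper(trim(t2.ID))
      OR upper(trim(t1.Synonym)) = upper(trim(t2.ID))
    ORDER BY t1.Name
""").fetchall()

# Distance test in ON: near matches are joined, unmatched rows are kept.
fuzzy = conn.execute("""
    SELECT t1.Name, t2.ID
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
      ON distance(upper(trim(t1.Name)), upper(trim(t2.ID))) <= 2
      OR distance(upper(trim(t1.Synonym)), upper(trim(t2.ID))) <= 2
    ORDER BY t1.Name
""").fetchall()

print(exact)
print(fuzzy)
```

Keeping the condition in ON also preserves the LEFT JOIN behaviour: a WHERE clause that tests t2 columns discards the rows where t2 is NULL.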
I have the following table (the data type of the column value is varchar, some values such as c2 and c4 are missing) :
__________________________
id | value
__________________________
1 | {{a1,b1,c1},{a2,b2,}}
__________________________
2 | {{a3,b3,c3},{a4,b4}}
__________________________
and I would like to obtain something like:
id | A | B | C
__________________
1 | a1 | b1 | c1
__________________
1 | a2 | b2 |
__________________
2 | a3 | b3 | c3
__________________
2 | a4 | b4 |
I am trying to use regexp_split_to_array, without any success so far.
How can this be achieved?
Thank you!
This assumes you know what the possible values are (e.g. a*, b*) because otherwise generating the appropriate columns for the value types will require dynamic sql.
Setup:
CREATE TABLE t (id INTEGER, value VARCHAR);
INSERT INTO t
VALUES
(1, '{{a1,b1,c1},{a2,b2,}}'),
(2, '{{a3,b3,c3},{a4,b4}}')
;
Query:
SELECT
id,
NULLIF(r[1], '') AS a,
NULLIF(r[2], '') AS b,
NULLIF(r[3], '') AS c
FROM (
SELECT id, regexp_split_to_array(r[1], ',') AS r
FROM (
SELECT id, regexp_matches(value, '{([^{][^}]+)}', 'g') AS r
FROM t
) x
) x;
Result:
| id | a | b | c |
| --- | --- | --- | --- |
| 1 | a1 | b1 | c1 |
| 1 | a2 | b2 | |
| 2 | a3 | b3 | c3 |
| 2 | a4 | b4 | |
Note that if it's possible for earlier values to be missing, e.g. {b1,c1} where a1 is missing, then the query would have to be different.
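For comparison, the same two steps (pull out each inner {...} group, then split it on commas) can be sketched outside the database. This is a minimal Python illustration, not a replacement for the query; the explode helper and the fixed width of three columns are assumptions made for the example.

```python
import re

# Same sample values as in the setup above.
rows = [
    (1, "{{a1,b1,c1},{a2,b2,}}"),
    (2, "{{a3,b3,c3},{a4,b4}}"),
]


def explode(row_id, value, width=3):
    """Yield one (id, a, b, c) tuple per inner {...} group."""
    for group in re.findall(r"\{([^{}]+)\}", value):
        parts = group.split(",")
        parts += [""] * (width - len(parts))       # pad short groups
        # Empty strings become None, like NULLIF(r[n], '') in the query.
        yield (row_id, *[p or None for p in parts[:width]])


result = [rec for row_id, value in rows for rec in explode(row_id, value)]
for rec in result:
    print(rec)
```

As with the SQL version, this relies on missing values only ever appearing at the end of a group.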
You can use string_to_array to convert the string to an array and then expand it into multiple rows with unnest:
EXAMPLE
SELECT unnest(string_to_array('{1 2 3},{4 5 6},{7 8 9}', ','));
{1 2 3}
{4 5 6}
{7 8 9}
I have 2 tables. "table1" has a column that stores "table2" column names. Table1 data is as below:
ID | Desc | Table2ColName | Active
-------------------------------------
1 | 1 Day | D1 | Yes
2 | 2 Days | D2 | No
3 | 3 Days | D3 | Yes
Table2 data as below:
ID | ShopName | D1 | D2 | D3
----------------------------------
1 | Sp1 | 100 | 80 | 120
I want to join the 2 tables and display only the Active data. How do I use LINQ to get the result below?
ID | ShopName | D1 | D3
---------------------------
1 | Sp1 | 100 | 120
I have tried all day but got nothing. I hope someone can help. Thanks.
I assume you already got the answer you needed for this, but I figured I'd post it anyway. Your query should look something like this.
var results = from a in data.table1
              join b in data.table2
              on a.ID equals b.ID
              where a.Active == "Yes"
              select new
              {
                  a.ID,
                  b.ShopName,
                  b.D1,
                  b.D3
              };
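The projection above lists its columns statically, so it cannot follow changes to the Active flags. One way to pick the columns at run time is to read the active names from table1 first and build the column list from them. The sketch below is not LINQ: it shows the idea in Python against SQLite (both assumptions made for the sake of a runnable example), and it checks each stored name against the real columns of table2 before using it in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER, "Desc" TEXT, Table2ColName TEXT, Active TEXT);
CREATE TABLE table2 (ID INTEGER, ShopName TEXT, D1 INTEGER, D2 INTEGER, D3 INTEGER);
INSERT INTO table1 VALUES
    (1, '1 Day', 'D1', 'Yes'), (2, '2 Days', 'D2', 'No'), (3, '3 Days', 'D3', 'Yes');
INSERT INTO table2 VALUES (1, 'Sp1', 100, 80, 120);
""")

# Step 1: which table2 columns are currently active?
active = [r[0] for r in conn.execute(
    "SELECT Table2ColName FROM table1 WHERE Active = 'Yes' ORDER BY ID")]

# Step 2: only accept names that really are columns of table2.
known = {r[1] for r in conn.execute("PRAGMA table_info(table2)")}
columns = ["ID", "ShopName"] + [c for c in active if c in known]

# Step 3: select just those columns.
rows = conn.execute(f"SELECT {', '.join(columns)} FROM table2").fetchall()

print(columns)
print(rows)
```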
I'm trying to use either CONTAINS or FREETEXT in SQL to search for a couple of words at the same time. The issue is that when you search for a couple of words, FREETEXT or CONTAINS matches per column, not per row. For example:
Imagine this is my database
id | c1 | c2
===============================
a | 1 | 2
b | 1 | 3
c | 1 | 2
d | 2 | 2
e | 3 | 3
f | 2 | 1
When you search with CONTAINS(*, '1,2') or, of course, FREETEXT(*, '1 2'), it will return
id | c1 | c2
===============================
a | 1 | 2
b | 1 | 3
c | 1 | 1
d | 2 | 2
e | 2 | 3
f | 2 | 1
Which is basically the entire database. But what I wanted was this
id | c1 | c2
===============================
a | 1 | 2
f | 2 | 1
That is, the rows that contain both 1 and 2.
By the way, I'm using SQL Server 2008 and classic ASP.
I would really appreciate your suggestions.
I have had this problem often. The solution I have used has never really satisfied me performance-wise, but it does work. It is concatenating the values I am searching for.
SELECT *
FROM (
    -- MyTable is a placeholder for the real table name
    SELECT (CAST(T.C1 AS VARCHAR(10)) + '-' + CAST(T.C2 AS VARCHAR(10))) AS C3, T.*
    FROM MyTable T
    WHERE T.C1 IN (1, 2)          -- all values searched for in C1
) T2
WHERE T2.C3 IN ('1-2', '2-1')     -- all concatenated combinations
If the types are numeric, you should use CONVERT or CAST to VARCHAR, as above.
Hope this helps.
EDIT: I got an error using CONTAINS for lack of an index, so I changed it to IN.
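The concatenation approach can be checked with a small sketch. It assumes SQLite through Python's sqlite3 module (so || replaces + for string concatenation), a hypothetical table name t, and a subset of the question's rows (a, b, d and f). The list of concatenated combinations is generated from the search terms instead of being typed by hand.

```python
import sqlite3
from itertools import permutations

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id TEXT, c1 INTEGER, c2 INTEGER);
INSERT INTO t VALUES ('a', 1, 2), ('b', 1, 3), ('d', 2, 2), ('f', 2, 1);
""")

terms = (1, 2)
# Every ordering of the search terms: ['1-2', '2-1'].
combos = [f"{x}-{y}" for x, y in permutations(terms, 2)]
placeholders = ", ".join("?" for _ in combos)

# Keep only rows whose concatenated columns equal one of the combinations.
rows = conn.execute(
    f"""
    SELECT id, c1, c2
    FROM t
    WHERE (CAST(c1 AS TEXT) || '-' || CAST(c2 AS TEXT)) IN ({placeholders})
    ORDER BY id
    """,
    combos,
).fetchall()

print(rows)
```

Because the filter applies a function to the columns, an ordinary index on c1 or c2 is unlikely to help with it, which fits the performance concern mentioned above.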