Hi guys, I'm new here and could really use your help writing a SQL script/function for the following problem.
I have a source table with three columns: Name, Value, and miNum. Example data:
Name Value miNum
A+B+C 1+2+3 a1
C+D+E 3+4+5 a3
E+F 5+2 a7
I have created a final_table whose columns are the same as the source table's, plus additional columns labelled a-z (29 columns in total).
What I want the script/function to do is read each row from the source table and populate the corresponding columns in final_table.
Example output of final_table
Name   Value  miNum  A  B  C  D  E  F
A+B+C  1+2+3  a1     1  2  3
C+D+E  3+4+5  a3           3  4  5
E+F    5+2    a7                 5  2
New columns will be regularly added to final_table, so it doesn't make sense to hard-code the columns in the SQL. Is it possible to do all this without hard-coding column names?
Could someone kindly show me how to achieve this?
Thanks
Add the rest of the columns following this pattern:
select tst.*,
case when instr(name,'A') > 0 then substr(Value,instr(name,'A'),1) end A,
case when instr(name,'B') > 0 then substr(Value,instr(name,'B'),1) end B,
case when instr(name,'C') > 0 then substr(Value,instr(name,'C'),1) end C,
case when instr(name,'D') > 0 then substr(Value,instr(name,'D'),1) end D,
case when instr(name,'E') > 0 then substr(Value,instr(name,'E'),1) end E,
case when instr(name,'F') > 0 then substr(Value,instr(name,'F'),1) end F
from tst;
Gives result
NAME   VALUE  MINUM  A B C D E F
------ ------ ------ - - - - - -
A+B+C  1+2+3  a1     1 2 3
C+D+E  3+4+5  a3         3 4 5
E+F    5+2    a7             5 2
Note that this approach works only when each name appears at most once per row; for duplicated names only the first value is shown, e.g.
NAME VALUE MINUM A B C D E F
------ ------ ------ - - - - - -
A+A+A 5+2+1 a11 5
The other restriction is that the keys are only A-Z and the values are single characters. If this holds, you may use this insert to populate the target table:
Insert into TARGET
(name, value, miNum,A,B,C,D,E,F)
select name, value, miNum,
case when instr(name,'A') > 0 then substr(Value,instr(name,'A'),1) end A,
case when instr(name,'B') > 0 then substr(Value,instr(name,'B'),1) end B,
case when instr(name,'C') > 0 then substr(Value,instr(name,'C'),1) end C,
case when instr(name,'D') > 0 then substr(Value,instr(name,'D'),1) end D,
case when instr(name,'E') > 0 then substr(Value,instr(name,'E'),1) end E,
case when instr(name,'F') > 0 then substr(Value,instr(name,'F'),1) end F
from source;
You will have to extend both the insert column list and the query with the columns up to Z.
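If the list of keys keeps growing, one way to avoid hard-coding it is to split Name and Value on '+' and pair the pieces up, rather than relying on character positions. A minimal Python sketch of that idea follows; the function name split_row and the variable names are illustrative assumptions, not part of the answer above. It also copes with multi-character values, and like the SQL it keeps only the first value for a repeated key.

```python
def split_row(name, value):
    """Pair each key in `name` with the value at the same position in `value`."""
    keys = name.split("+")
    values = value.split("+")
    if len(keys) != len(values):
        raise ValueError(f"{name!r} and {value!r} have a different number of parts")
    result = {}
    for key, val in zip(keys, values):
        result.setdefault(key, val)  # keep the first value for a repeated key
    return result


# Sample rows from the question: (Name, Value, miNum)
rows = [("A+B+C", "1+2+3", "a1"), ("C+D+E", "3+4+5", "a3"), ("E+F", "5+2", "a7")]

# The column list is derived from the data instead of being hard-coded.
columns = sorted({key for name, _, _ in rows for key in name.split("+")})

table = [
    {"Name": name, "Value": value, "miNum": mi_num,
     **{col: split_row(name, value).get(col) for col in columns}}
    for name, value, mi_num in rows
]
```

Each entry of table is one output row, with None in the columns that the row does not mention.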
I am transposing a data frame that has no defined column names, and then need to drop the rows of the transposed table whose value in the first column (index 0) starts with 'zrx'. I thought something like this should work, but I can't get it working:
df[~df[0].str.startswitg("zrx")]
Input data looks like this (no headers):
Index 0 Index 1
zrx456. True
zrx567 false
abc234 True
Gfh123 False
nbv345 True
zrx456 False
zrx668 True
zrx789 True
My goal is to return this data frame without the rows that start with zrx in column 0.
If you know the name of the first column, use
df[~df.Artist.str.startswith('zrx')]
If you do not know the name of the first column, use
df[~df.iloc[:,0].str.startswith('zrx')]
Input
Artist Album Point
0 zrxAC1 A 1
1 AC2 B 2
2 zrxAC1 NaN 3
3 AC4 A 4
4 AC5 C 5
Output
Artist Album Point
1 AC2 B 2
3 AC4 A 4
4 AC5 C 5
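One thing to watch with either form: if the first column contains missing values, str.startswith yields a missing value for them, which breaks the boolean mask. A minimal sketch, assuming a header-less frame like the one in the question (the sample values here are illustrative), that passes na=False so missing entries count as non-matching and are kept:

```python
import pandas as pd

# Header-less data, so the columns are labelled 0 and 1 (sample values are illustrative).
df = pd.DataFrame([["zrx456", True], ["abc234", True], [None, False], ["zrx789", True]])

# na=False makes missing entries count as "does not start with zrx".
mask = df.iloc[:, 0].str.startswith("zrx", na=False)

# Keep every row whose first column does not start with "zrx".
kept = df[~mask]
```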
I have a table whose structure looks like the following:
k | i | p | v
Note that the key (k) is not unique; the table has no primary key or constraints of any kind. Each key can have multiple attributes (i = 0, 1, 2, ...) which can be of different types (p) and have different values (v). One attribute type may also appear multiple times (p(i-1) = p(i)).
What I want to do is pick certain attribute types and their corresponding values and place them in the same row. For example I want to have:
k | attr_name1 | attr_name2
I have managed to make a query that does this and works for all keys (k) for which attr_name1 and attr_name2 appear in the column p of the initial table:
SELECT DISTINCT ON (key) fn.k AS key, fn.v AS attr_name1, a.v AS attr_name2
FROM Table fn
LEFT JOIN Table a ON fn.k = a.k
AND a.p = 'attr_name2'
WHERE fn.p = 'attr_name1'
I would like, however, to take into account the case where a certain key has no attribute named attr_name1 and insert a NULL value into the corresponding column of the new table. I am not sure how to achieve that. I have no issue using multiple queries or intermediate tables etc, but there are quite a lot of rows in the table and I need something that scales to millions of rows.
Any help would be appreciated.
Example:
k i p v
1 0 a 10
1 1 b 12
1 2 c 34
1 3 d 44
1 4 e 09
2 0 a 11
2 1 b 13
2 2 d 22
2 3 f 34
Would turn into (assuming I am only interested in columns a, b, c):
k a b c
1 10 12 34
2 11 13 NULL
I would use conditional aggregation. That is, an aggregate function around a CASE expression.
SELECT
k,
MAX(CASE WHEN p='a' THEN v END) AS a,
MAX(CASE WHEN p='b' THEN v END) AS b,
MAX(CASE WHEN p='c' THEN v END) AS c
FROM
your_table
GROUP BY
k
This presumes that (k, p) is unique. If the same p appears more than once for a key, MAX returns the highest v for that (k, p).
As a general rule this kind of pivoting makes the data harder to process in SQL. This is often done for display purposes because humans find this easier to read. However, from a software engineering perspective, such formatting should not be done in the data layer; be careful that by doing this you don't actually make your future life harder.
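To see the NULL handling in action, the query can be run against the example rows from the question. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the column types are assumptions, and v is stored as text so that '09' keeps its leading zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (k INTEGER, i INTEGER, p TEXT, v TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [(1, 0, "a", "10"), (1, 1, "b", "12"), (1, 2, "c", "34"),
     (1, 3, "d", "44"), (1, 4, "e", "09"),
     (2, 0, "a", "11"), (2, 1, "b", "13"), (2, 2, "d", "22"), (2, 3, "f", "34")],
)

# A CASE without ELSE yields NULL, and MAX over only NULLs is NULL,
# so a key without attribute 'c' still gets a row.
pivoted = conn.execute(
    """
    SELECT k,
           MAX(CASE WHEN p = 'a' THEN v END) AS a,
           MAX(CASE WHEN p = 'b' THEN v END) AS b,
           MAX(CASE WHEN p = 'c' THEN v END) AS c
    FROM your_table
    GROUP BY k
    ORDER BY k
    """
).fetchall()

print(pivoted)  # [(1, '10', '12', '34'), (2, '11', '13', None)]
```

Key 2 has no attribute c, so its c column comes back as None (NULL), which matches the desired output in the question.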
I am trying to sort rows of data so that the integer value of an alpha-numerical address is in order of odd values then even values given they are of the same type.
The only way I have got it to (semi)work was this:
-Find if the integer of the address is even or odd
-Add EVEN or ODD to a cell in that addresses corresponding row
-Run the macro
-Filter the data by EVEN or ODD designation
This approach isn't ideal. I am interested in rearranging the rows without having to use filtering.
Below is an example of how the sorting would go.
UNSORTED SORTED
Address Type Address Type
1.1p A 1.1p A
1.2p A 1.2p A
1.3p A 1.3p A
1.4p A 1.4p A
2.1p A 3.1p A
2.2p A 3.2p A
2.3p A 3.3p A
2.4p A 3.4p A
3.1p A 5.1p A
3.2p A 5.2p A
3.3p A 5.3p A
3.4p A 5.4p A
4.1p A 2.1p A
4.2p A 2.2p A
4.3p A 2.3p A
4.4p A 2.4p A
5.1p A 4.1p A
5.2p A 4.2p A
5.3p A 4.3p A
5.4p A 4.4p A
6.1p B 7.1p B
6.2p B 7.2p B
6.3p B 7.3p B
6.4p B 7.4p B
7.1p B 9.1p B
7.2p B 9.2p B
7.3p B 9.3p B
7.4p B 9.4p B
8.1p B 6.1p B
8.2p B 6.2p B
8.3p B 6.3p B
8.4p B 6.4p B
9.1p B 8.1p B
9.2p B 8.2p B
9.3p B 8.3p B
9.4p B 8.4p B
10.1p B 10.1p B
10.2p B 10.2p B
10.3p B 10.3p B
10.4p B 10.4p B
I am new to VBA. Thank you in advance for any suggestions.
I think you need to create a helper column that stores a value you can sort on.
The basic idea is to extract the numeric value from your "Address" column, check whether its integer part is even and, if so, multiply it by a high value (e.g. 1000) so that it sorts after every odd value. This works as long as the factor is larger than the highest odd value.
You can either use a formula for this cell, though it looks a little complicated to me. Assuming that your data starts in cell A2:
=VALUE(LEFT(A2, SEARCH("p", A2, 1)-1))*IF(ISODD(VALUE(LEFT(A2, SEARCH("p", A2, 1)-1))),1,1000)
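or write a small UDF:
Function SortVal(s As String) As Double
    ' Val reads the leading number and stops at the first non-numeric character ("1.1p" -> 1.1)
    SortVal = Val(s)
    ' Push addresses with an even integer part behind all odd ones
    If Int(SortVal) Mod 2 = 0 Then SortVal = SortVal * 1000
End Function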
and put a call to it in your helper column
=SortVal(A2)
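The same ordering can also be expressed without a multiplier by sorting on a tuple: type first, then an even/odd flag, then the number itself. The following Python sketch only illustrates that logic and is not a replacement for the VBA; the function name sort_key and the sample rows are assumptions.

```python
import re


def sort_key(row):
    """Order by type, then odd integer parts before even ones, then by numeric value."""
    address, kind = row
    # Leading number of the address, e.g. "10.1p" -> 10.1
    number = float(re.match(r"\d+(?:\.\d+)?", address).group())
    is_even = int(number) % 2 == 0  # False sorts before True, so odd comes first
    return (kind, is_even, number)


rows = [("2.1p", "A"), ("1.2p", "A"), ("3.1p", "A"), ("6.1p", "B"),
        ("1.1p", "A"), ("7.1p", "B"), ("10.1p", "B")]
ordered = sorted(rows, key=sort_key)
```

Because the parity is a separate sort level, there is no factor that has to stay larger than the highest odd address.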
MSSQL: I have this example data:
NAME AValue BValue
A 1 11
B 1 11
C 2 11
D 2 21
E 3 21
F 3 21
G 4 31
H 4 31
I 5 41
J 5 NULL
...
I am looking for an algorithm that finds all the Names that are tied together by their values, where the two value columns are assigned with different seeds (AValue and BValue; in this case each AValue is given to 2 names and each BValue to 3 names, but values can be skipped or assigned later, so it is not just a matter of finding the smallest common multiple). In this case the output should be 1,2,3,4,11,21,31 as a first group/result. Then all the Names with these values can be updated etc.
I need to find all the Names in a "closed circle" of values generated by different seeds.
EDIT:
(an attempt at a simpler example)
Imagine that you have a list of names. Each name is given two numbers. In most cases these numbers are handed out by some seed (in this example each AValue is given twice and each BValue three times), but some numbers can be skipped, so you cannot just take the smallest common multiple of the two seeds (here 2 x 3, which would mean that every 6 names form a closed group in which no Name has an AValue or BValue from the next/different group). For example, Name A has 1 and 11. 1 is given to A and B, 11 to A, B and C. These Names have 1,2,11,21. So you check for 2 and 21 and get E and F in addition, and then the loop of checking should continue; once no more Names are pulled in, the output should be 1,2,3,11,21. That is the "closed circle".
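One way to read the "closed circle" is as a connected component: two Names belong together if they share an AValue or a BValue, directly or through a chain of other Names. The sketch below applies union-find to the sample rows in Python. The choice of union-find, the function name closed_groups and the treatment of NULL as "links nothing" are all assumptions, not something stated in the question.

```python
from collections import defaultdict


def closed_groups(rows):
    """Group names linked, directly or transitively, by a shared AValue or BValue."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(left, right):
        parent[find(left)] = find(right)

    for name, a_value, b_value in rows:
        find(("name", name))  # register the name even if both values are NULL
        # Tag values with their column so AValue 1 and BValue 1 never collide.
        for column, value in (("A", a_value), ("B", b_value)):
            if value is not None:  # assumption: NULL links nothing
                union(("name", name), (column, value))

    groups = defaultdict(lambda: {"names": set(), "values": set()})
    for node in parent:
        kind, item = node
        groups[find(node)]["names" if kind == "name" else "values"].add(item)
    return sorted(groups.values(), key=lambda group: min(group["names"]))


rows = [("A", 1, 11), ("B", 1, 11), ("C", 2, 11), ("D", 2, 21), ("E", 3, 21),
        ("F", 3, 21), ("G", 4, 31), ("H", 4, 31), ("I", 5, 41), ("J", 5, None)]
groups = closed_groups(rows)
```

On these rows the first group contains the Names A to F with the values 1, 2, 3, 11 and 21, which is the result described in the edited example.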
I'll clarify this: I have a result set with the twist that the two PK fields (A and B) can be the same across rows while field C differs.
Example:
A  B  C D
14 20 1 null
14 20 2 1
15 20 2 0
As you can see, D field has a null and a 0.
What I have to do is change D's null value to 1 whenever there is more than one record with the same A, without touching the 0's in D.
I tried initially with NVLs and DECODEs, like this:
DECODE(migr.A,NULL,(NVL(C,1)),D) AS D
but I'm not getting all the records, only the D-1's.
I really don't want to relate to an extra table/step for validation, as my query result can be easily over 1 million records, but if that's the best, I'm ok.
Many thanks.
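One possible approach that avoids a separate validation table is to count the rows per A inside the same query and replace the NULL only when that count is greater than 1. A minimal sketch follows, run through Python's sqlite3 module so that it is self-contained. The table name migr_result is hypothetical, the row with A = 16 is added purely for illustration, and "the same A" is assumed to be the grouping rule. In Oracle, an analytic COUNT(*) OVER (PARTITION BY A) could take the place of the grouped subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE migr_result (A INTEGER, B INTEGER, C INTEGER, D INTEGER)")
conn.executemany(
    "INSERT INTO migr_result VALUES (?, ?, ?, ?)",
    [(14, 20, 1, None), (14, 20, 2, 1), (15, 20, 2, 0),
     (16, 20, 1, None)],  # illustrative extra row: a NULL with no sibling record
)

# Replace NULL by 1 only when the same A occurs on more than one row;
# every other D value, including 0, passes through unchanged.
fixed = conn.execute(
    """
    SELECT r.A, r.B, r.C,
           CASE WHEN r.D IS NULL AND cnt.n > 1 THEN 1 ELSE r.D END AS D
    FROM migr_result r
    JOIN (SELECT A, COUNT(*) AS n FROM migr_result GROUP BY A) cnt
      ON cnt.A = r.A
    ORDER BY r.A, r.C
    """
).fetchall()
```

The inner join keeps every source row because each A always has a matching count, so no records are dropped from the result.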