Is there a special name for a multi-dimensional table that concatenates the inputs? - data-visualization

Hello I have been using tables like this a lot:
  | A  | B
C | AC | BC
D | AD | BD
E | AE | BE
Does this have a name?
Is there another name for it if both axes have the same inputs? An X marks a duplicate.
  | A  | B  | C
A | AA | AB | AC
B | X  | BB | BC
C | X  | X  | CC
We have been using these for a project quite a bit and no one is sure how to refer to them. I figured these are so common they must have a name but when I search for them I find nothing.
Some more examples:

Related

PostgreSQL data transformation - Turn rows into columns

I have a table whose structure looks like the following:
k | i | p | v
Notice that the key (k) is not unique, there are no keys, nothing. Each key can have multiple attributes (i = 0, 1, 2, ...) which can be of different types (p) and have different values (v). One attribute type may also appear multiple times (p(i-1) = p(i)).
What I want to do is pick certain attribute types and their corresponding values and place them in the same row. For example I want to have:
k | attr_name1 | attr_name2
I have managed to make a query that does this and works for all keys (k) for which attr_name1 and attr_name2 appear in the column p of the initial table:
SELECT DISTINCT ON (key) fn.k AS key, fn.v AS attr_name1, a.v AS attr_name2
FROM Table fn
LEFT JOIN Table a ON fn.k = a.k
AND a.p = 'attr_name2'
WHERE fn.p = 'attr_name1'
I would like, however, to take into account the case where a certain key has no attribute named attr_name1 and insert a NULL value into the corresponding column of the new table. I am not sure how to achieve that. I have no issue using multiple queries or intermediate tables etc, but there are quite a lot of rows in the table and I need something that scales to millions of rows.
Any help would be appreciated.
Example:
k i p v
1 0 a 10
1 1 b 12
1 2 c 34
1 3 d 44
1 4 e 09
2 0 a 11
2 1 b 13
2 2 d 22
2 3 f 34
Would turn into (assuming I am only interested in columns a, b, c):
k a b c
1 10 12 34
2 11 13 NULL
I would use conditional aggregation. That is, an aggregate function around a CASE expression.
SELECT
    k,
    MAX(CASE WHEN p = 'a' THEN v END) AS a,
    MAX(CASE WHEN p = 'b' THEN v END) AS b,
    MAX(CASE WHEN p = 'c' THEN v END) AS c
FROM
    your_table
GROUP BY
    k
This presumes that (k, p) is unique. If a (k, p) pair appears more than once, MAX will return the highest v for that pair.
As a general rule this kind of pivoting makes the data harder to process in SQL. This is often done for display purposes because humans find this easier to read. However, from a software engineering perspective, such formatting should not be done in the data layer; be careful that by doing this you don't actually make your future life harder.
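The conditional aggregation above can be tried out on the question's sample data. The following is a minimal sketch that runs the same query through Python's built-in sqlite3 module rather than PostgreSQL; the in-memory database, the `pivot` helper, and storing v as an integer (so '09' becomes 9) are assumptions made for illustration.

```python
import sqlite3

# Sample rows (k, i, p, v) from the question; v is stored as an integer here.
ROWS = [
    (1, 0, "a", 10), (1, 1, "b", 12), (1, 2, "c", 34), (1, 3, "d", 44), (1, 4, "e", 9),
    (2, 0, "a", 11), (2, 1, "b", 13), (2, 2, "d", 22), (2, 3, "f", 34),
]

# Same conditional aggregation as in the answer, plus ORDER BY for a stable result.
PIVOT_SQL = """
SELECT
    k,
    MAX(CASE WHEN p = 'a' THEN v END) AS a,
    MAX(CASE WHEN p = 'b' THEN v END) AS b,
    MAX(CASE WHEN p = 'c' THEN v END) AS c
FROM your_table
GROUP BY k
ORDER BY k
"""


def pivot(rows):
    """Load rows into an in-memory table and return the pivoted result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE your_table (k INTEGER, i INTEGER, p TEXT, v INTEGER)")
        conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
        return conn.execute(PIVOT_SQL).fetchall()
    finally:
        conn.close()


print(pivot(ROWS))  # [(1, 10, 12, 34), (2, 11, 13, None)]
```

Key 2 has no attribute c, so the CASE expression yields NULL for every one of its rows and MAX returns NULL, which is the behaviour the question asks for.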

Odd Even Sorting in VBA

I am trying to sort rows of data so that the integer value of an alpha-numerical address is in order of odd values then even values given they are of the same type.
The only way I have got it to (semi)work was this:
-Find if the integer of the address is even or odd
-Add EVEN or ODD to a cell in that address's corresponding row
-Run the macro
-Filter the data by EVEN or ODD designation
This approach isn't ideal. I am interested in rearranging the rows without having to use filtering.
Below is an example of how the sorting would go.
UNSORTED SORTED
Address Type Address Type
1.1p A 1.1p A
1.2p A 1.2p A
1.3p A 1.3p A
1.4p A 1.4p A
2.1p A 3.1p A
2.2p A 3.2p A
2.3p A 3.3p A
2.4p A 3.4p A
3.1p A 5.1p A
3.2p A 5.2p A
3.3p A 5.3p A
3.4p A 5.4p A
4.1p A 2.1p A
4.2p A 2.2p A
4.3p A 2.3p A
4.4p A 2.4p A
5.1p A 4.1p A
5.2p A 4.2p A
5.3p A 4.3p A
5.4p A 4.4p A
6.1p B 7.1p B
6.2p B 7.2p B
6.3p B 7.3p B
6.4p B 7.4p B
7.1p B 9.1p B
7.2p B 9.2p B
7.3p B 9.3p B
7.4p B 9.4p B
8.1p B 6.1p B
8.2p B 6.2p B
8.3p B 6.3p B
8.4p B 6.4p B
9.1p B 8.1p B
9.2p B 8.2p B
9.3p B 8.3p B
9.4p B 8.4p B
10.1p B 10.1p B
10.2p B 10.2p B
10.3p B 10.3p B
10.4p B 10.4p B
I am new to VBA. Thank you in advance for any suggestions.
I think you need to create a helper column where you can store a value that you can use for sorting.
The basic idea is to extract the numeric value from your "Address" column, check whether it is even, and if so multiply it by a high value (e.g. 1000) so that it is guaranteed to be higher than the highest possible odd value.
You can either use a formula for this cell, although it looks a little complicated to me. Assuming that your data starts in cell A2:
=VALUE(LEFT(A2, SEARCH("p", A2, 1)-1))*IF(ISODD(VALUE(LEFT(A2, SEARCH("p", A2, 1)-1))),1,1000)
or write a small UDF
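Function SortVal(s As String) As Double
    ' Val reads the leading number and ignores the trailing "p", e.g. "2.1p" -> 2.1
    SortVal = Val(s)
    ' Push even integer parts behind every odd one by scaling them up
    If Int(SortVal) Mod 2 = 0 Then SortVal = SortVal * 1000
End Function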
and put a call to it in your helper column
=SortVal(A2)
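To check that the helper value produces the ordering shown in the question, here is a small Python sketch of the same idea, written outside Excel. `sort_val` mirrors the SortVal UDF; `odd_even_sort` and the choice to sort by Type first and helper value second are assumptions based on the question's expected output, not part of the answer.

```python
import re


def sort_val(address):
    """Leading number of the address, scaled by 1000 when its integer part is even."""
    match = re.match(r"\d+(?:\.\d+)?", address)
    if match is None:
        raise ValueError(f"no leading number in {address!r}")
    value = float(match.group())
    if int(value) % 2 == 0:
        value *= 1000  # even groups sort after every odd group
    return value


def odd_even_sort(rows):
    """Sort (address, type) rows by type, then odd groups before even groups."""
    return sorted(rows, key=lambda row: (row[1], sort_val(row[0])))


# Rebuild the question's unsorted data: 1.1p .. 5.4p are type A, 6.1p .. 10.4p are type B.
rows = [(f"{n}.{m}p", "A" if n <= 5 else "B") for n in range(1, 11) for m in range(1, 5)]
for address, kind in odd_even_sort(rows):
    print(address, kind)
```

As in the spreadsheet version, the factor 1000 only works while every odd address stays below the smallest scaled even value.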

Compare multiple values based on cell Value

I have 3 datasets.
The Master dataset has:
A B C D
11 T Jim India
12 U Mary UK
13 V Bob US
14 P Peter India
India dataset
A B H K
10 11 T Jim
10 13 0 Krestel
10 14 P Peter
10 15 L Robert
If column D contains India, then the values in columns A, B and C should match columns B, H and K, respectively, in the India dataset. (The combination of columns A, B and C should be present in the India dataset; if it is not, the row should be highlighted or a comment added in the last column of the Master dataset.)
I have been doing this manually by adding several helper columns to all the datasets using concatenation and then using VLOOKUP.
Is it possible to automate this process using VBA?
Any help will be appreciated.
Actually, I think that you can achieve this through spreadsheet functions alone, without the need of VBA. Check the usage of the function VLOOKUP.
The idea would be to deploy a formula in, say, column "E" of the Master dataset that would check for an entry in the relevant country dataset matching the values of A, B and C. You will need to build the reference to the range VLOOKUP uses taking into account the country name.
Hope this serves you as a good guide.
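The lookup described above boils down to checking whether the (A, B, C) combination of each Master row exists among the (B, H, K) combinations of the matching country dataset. Here is a minimal Python sketch of that check, outside the spreadsheet; the function name `flag_rows`, the status strings, and the dictionary of country datasets are assumptions made for illustration.

```python
def flag_rows(master_rows, country_datasets):
    """Return one status per Master row: 'match', 'missing', or 'no dataset'."""
    # Pre-compute the (B, H, K) combinations of each country dataset once.
    keys_by_country = {
        country: {(b, h, k) for _a, b, h, k in rows}
        for country, rows in country_datasets.items()
    }
    statuses = []
    for a, b, c, country in master_rows:
        keys = keys_by_country.get(country)
        if keys is None:
            statuses.append("no dataset")  # no dataset supplied for this country
        elif (a, b, c) in keys:
            statuses.append("match")
        else:
            statuses.append("missing")  # the row to highlight or comment on
    return statuses


# Sample data from the question: Master columns A, B, C, D and India columns A, B, H, K.
master = [(11, "T", "Jim", "India"), (12, "U", "Mary", "UK"),
          (13, "V", "Bob", "US"), (14, "P", "Peter", "India")]
india = [(10, 11, "T", "Jim"), (10, 13, "0", "Krestel"),
         (10, 14, "P", "Peter"), (10, 15, "L", "Robert")]

print(flag_rows(master, {"India": india}))
# ['match', 'no dataset', 'no dataset', 'match']
```

Comparing whole tuples plays the role of the concatenated helper columns, without the risk of two different combinations concatenating to the same text.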

Split a string to its subwords

Every letter has a value
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
TableA
String Length Value Subwords
exampledomain 13 132 #example-domain#example-do-main#
creditcard 10 85 #credit-card#credit-car-d#
TableB
Words Length Value
example 7 76
do 2 19
main 4 37
domain 6 56
credit 6 59
card 4 26
car 3 22
d 1 4
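The Value column in both tables is the sum of the letter values listed at the top. Here is a minimal Python sketch of that computation; the function name `word_value` is hypothetical, and it assumes the strings contain only the letters a-z.

```python
def word_value(word):
    """Sum of the letter positions in the alphabet (a=1 ... z=26)."""
    return sum(ord(ch) - ord("a") + 1 for ch in word.lower())


print(word_value("creditcard"))                    # 85
print(word_value("domain"), word_value("modain"))  # 56 56
```

Because the sum ignores letter order, rearranged letters give the same value, so a matching length and value can only narrow down candidates and never confirms a split by itself.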
Explanation
TableA holds over a million rows of strings, and about 100k new rows are added to it every day.
The "String" column contains no whitespace.
TableB holds over a million rows of words; it contains every letter and word in 1-2 languages.
What I want to do
I want to split the strings in TableA into their subwords, as in the example: for "creditcard" I search all the words in TableB and try to find which words, put together, match the string.
What I did, which did not solve my problem
I took the string and joined it to TableB with INNER JOINs. I repeated the INNER JOIN 2-3 times because there can be 3-word and 4-word strings too, and that worked! But it takes too much time even for 100-200 strings, let alone 100k every day.
What I am trying now
I gave a value to every letter, as shown above.
Taking the strings one by one, I computed the value of each string from the letters it contains.
I did the same for the words in TableB.
Now every string in TableA and every word in TableB has its value.
1- I will take the string with its length and value (example: creditcard - 10 - 85)
2- and search TableB for combinations of words whose SUM(length) and SUM(value) match the string's length and value, and write these possibilities to a new column.
Finally, even when the sums of lengths and values match, some combinations will not spell the whole string, so I will eliminate those (example: "doma-in" could also be "moda-in"; the lengths and values are the same but the words are not).
I don't know, but I guess the value method can solve the time problem? If there are other ways to do this, I would be grateful for your advice.
Thanks
You could find the solutions recursively by always looking at the next letter. For example, for the word DOMAIN:
D - no
DO - is a word!
    M - no
    MA - no
    MAI - no
    MAIN - is a word!
    No more letters --> DO + MAIN
DOM - is a word!
    A - no
    AI - no
    AIN - no
    Finished without result
DOMA - no
DOMAI - no
DOMAIN - is a word!
    No more letters --> DOMAIN
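The recursive walk above can be sketched in Python as follows. This is an illustration rather than a tuned solution for millions of rows: the function name `split_into_words` is hypothetical, the word list is assumed to fit in memory as a set, and memoising on the start position is an addition that avoids re-splitting the same suffix.

```python
from functools import lru_cache


def split_into_words(text, words):
    """Return every way to write text as a sequence of known words, joined by '-'."""
    words = frozenset(words)

    @lru_cache(maxsize=None)
    def splits_from(start):
        # Reaching the end of the string means the words so far cover it exactly.
        if start == len(text):
            return [[]]
        results = []
        # Grow the candidate prefix one letter at a time, as in the trace above.
        for end in range(start + 1, len(text) + 1):
            prefix = text[start:end]
            if prefix in words:
                for rest in splits_from(end):
                    results.append([prefix] + rest)
        return results

    return ["-".join(parts) for parts in splits_from(0)]


table_b = {"example", "do", "main", "domain", "credit", "card", "car", "d"}
print(split_into_words("creditcard", table_b))
print(split_into_words("exampledomain", table_b))
```

Unlike matching on summed lengths and values, every result here spells the input exactly, so no anagram-style false positives need to be filtered out afterwards.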

Split content of a column and get the other replicated

I have a (very large) file with a structure like this:
A B C,D,E,F
The third column contains 4 values (the number can vary) separated by commas. I would like to convert that file into
A B C
A B D
A B E
A B F
Basically replicating the first two columns and splitting the third into rows.
Any idea on how to do that in awk?
$ awk '{n=split($3,a,/,/);for(i=1;i<=n;i++)print $1,$2,a[i]}' file
A B C
A B D
A B E
A B F
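For comparison, the same transformation can be sketched in Python. The generator name `expand_lines` is hypothetical; like the awk one-liner, it assumes whitespace-separated columns, and it skips lines with fewer than three fields.

```python
def expand_lines(lines, sep=","):
    """Yield one 'col1 col2 item' row per comma-separated item in the third column."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # nothing to split on this line
        for item in fields[2].split(sep):
            yield f"{fields[0]} {fields[1]} {item}"


for row in expand_lines(["A B C,D,E,F"]):
    print(row)
```

Because it is a generator, an open file object can be passed in place of the list, so a large file is processed one line at a time without loading it into memory.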