How to generate hierarchical data in mockaroo? - data-generation

I want to generate some mock data in mockaroo.
The format of the data should be like this
Claim ID Claim subid
1 1
1 2
1 3
2 1
2 2
2 3
3 1
So, basically, each Claim ID can have multiple Claim subids. Is this possible?

Related

How to indicate count of values in categorical column in Pandas, Python?

I have the following Pandas DataFrame:
ID CAT
1 A
1 B
1 A
2 A
2 B
2 A
1 B
1 A
I'd like to have a table that shows the number of occurrences of each CAT value per ID, in separate columns, like this:
ID CAT_A_NUM CAT_B_NUM
1 3 2
2 2 1
I tried several approaches, such as this one with a pivot table, without success:
df.pivot_table(values='CAT', index='ID', columns='CAT', aggfunc='count')
You can use crosstab():
import pandas as pd
df = pd.DataFrame(data={'ID': [1,1,1,2,2,2,1,1], 'CAT': ['A','B','A','A','B','A','B','A']})
final = pd.crosstab(df['ID'], df['CAT'])
final.columns = ['CAT_A_NUM', 'CAT_B_NUM']
final
ID CAT_A_NUM CAT_B_NUM
1 3 2
2 2 1
You can probably use groupby + unstack:
df.groupby(["ID","CAT"]).size().unstack()
which gives
CAT A B
ID
1 3 2
2 2 1
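The two answers above can be combined so that the column names come from the data instead of being typed by hand. This is a minimal sketch, assuming pandas is installed; `fill_value=0` and the `add_prefix`/`add_suffix` calls are additions that do not appear in the answers, and the variable name `counts` is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "ID":  [1, 1, 1, 2, 2, 2, 1, 1],
    "CAT": ["A", "B", "A", "A", "B", "A", "B", "A"],
})

# Count rows per (ID, CAT) pair, then spread the CAT values into columns.
# fill_value=0 stores 0 instead of NaN when an ID never has a category.
counts = df.groupby(["ID", "CAT"]).size().unstack(fill_value=0)

# Build CAT_A_NUM, CAT_B_NUM, ... from whatever categories are present.
counts = counts.add_prefix("CAT_").add_suffix("_NUM")

# Turn ID back into an ordinary column and drop the leftover "CAT" axis label.
counts = counts.reset_index().rename_axis(columns=None)
print(counts)
```

Deriving the names this way keeps working if a third category shows up later, whereas assigning a fixed list to `final.columns` would raise a length mismatch.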

Check constraint for multiple conditions

The teacher gave us a team assignment, and my teammate and I are really struggling with it (especially since we need to use things like TRIGGERS and PROCEDURES, which we haven't seen in class yet …).
We need to implement an arc-relationship, and we fail to understand how …
But before I tell you guys what I need to accomplish, I will give you part of the description of the task, so you guys can understand the situation a bit better …
We basically need to make an ERD for a VLSI CAD-system and we need to implement it. Now, we have our CELL entity, the attributes of which aren't really relevant … The only thing you guys need to know in order to help us is that it has a primary key, CELL_CODE, which is a VARCHAR.
Each CELL has many (I think at least four, I don't think you can have triangular CELLS, but doesn't matter anyways) SIDES. A SIDE can be logically identified by its CELL, and to make matters ridiculously difficult, each SIDE has to be numbered by its CELL, like so:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
Now, each SIDE has its CONNECTION_PINS. A CONNECTION_PIN is likewise identified by its SIDE, and the pins are numbered in a similar manner:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
CONNECTION_PINS:
SEQUENCE_NUMBER SIE_SEQUENCE_NUMBER CELL_CODE
1 1 1
2 1 1
1 2 1
2 2 1
1 3 1
2 3 1
1 1 2
2 1 2
1 2 2
2 2 2
1 3 2
2 3 2
I tried to explain the numbering issue we have here: Data model - PRIMARY KEY numbering issue, but yeah, I didn't really explain it the way it should be explained ...
Now, we have one final entity, which is where the arc comes in: CONNECTIONS. A CONNECTION has 2 CONNECTION_PINS: one for START_FROM and one for END_OF. Logically, the start pin can't also be the end pin of a given connection, and that's our struggle. Basically, this shouldn't be allowed:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
CONNECTION_PINS:
SEQUENCE_NUMBER SIE_SEQUENCE_NUMBER CELL_CODE
1 1 1
2 1 1
1 2 1
2 2 1
1 3 1
2 3 1
1 1 2
2 1 2
1 2 2
2 2 2
1 3 2
2 3 2
CONNECTIONS:
(you shouldn't be able to put this in …)
CPI_SEQNUM_START SIE_SEQNUM_START CELL_CODE_START CPI_SEQNUM_END SIE_SEQNUM_END CELL_CODE_END
1 1 1 1 1 1
Now, this is basically the ERD for this part:
ERD with barred relationships and the arc-relationship in question
and this is the physical model:
Physical model
I basically thought a simple CHECK might do (CHECK (CPI_SEQNUM_START <> CPI_SEQNUM_END AND CELL_CODE_START <> CELL_CODE_END AND SIE_SEQNUM_START <> SIE_SEQNUM_END) ), but that prevented us from inserting anything somehow … Any advice?
Using a CHECK constraint was the right approach, but the logic of the constraint was wrong. You need an OR condition: only one of the three fields needs to differ for the two pins to be different. With AND, all three fields have to differ, so every connection whose pins share a cell or a side number is rejected.
CPI_SEQNUM_START <> CPI_SEQNUM_END OR
CELL_CODE_START <> CELL_CODE_END OR
SIE_SEQNUM_START <> SIE_SEQNUM_END
... assuming all three fields are not nullable.
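The difference between the AND and OR versions can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database the assignment targets (an assumption; the question does not name one), declares every column NOT NULL as the answer assumes, and the helper `accepts` is hypothetical.

```python
import sqlite3

COLUMNS = """
    cpi_seqnum_start INTEGER NOT NULL,
    sie_seqnum_start INTEGER NOT NULL,
    cell_code_start  TEXT    NOT NULL,
    cpi_seqnum_end   INTEGER NOT NULL,
    sie_seqnum_end   INTEGER NOT NULL,
    cell_code_end    TEXT    NOT NULL
"""

def accepts(operator, row):
    """Create a throwaway CONNECTIONS table whose CHECK joins the three
    inequalities with `operator` ("AND" or "OR"), then try to insert `row`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE connections (
            {COLUMNS},
            CHECK (cpi_seqnum_start <> cpi_seqnum_end
                   {operator} sie_seqnum_start <> sie_seqnum_end
                   {operator} cell_code_start <> cell_code_end)
        )""")
    try:
        conn.execute("INSERT INTO connections VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:  # raised when the CHECK fails
        return False
    finally:
        conn.close()

same_pin = (1, 1, "1", 1, 1, "1")   # start pin == end pin
same_side = (1, 1, "1", 2, 1, "1")  # two different pins on the same side

print(accepts("OR", same_pin), accepts("OR", same_side))
print(accepts("AND", same_pin), accepts("AND", same_side))
```

The OR version rejects only the row where start and end are the same pin, while the AND version also rejects two distinct pins on the same side, which is the kind of row the question's constraint was blocking.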

SQL table structure for store value against list of combination

I have a requirement from a client where I need to store a value against a list of combinations.
For example, I have the following LOBs, and I need to store a value against each combination of them.
Auto
WC
Personal
I proposed multiple solutions, but he is not satisfied with any of them.
Solution 1: create a single table and insert a value against every possible combination (as a string), something like
LOB Value
Auto 1
WC 2
Personal 3
Auto,WC 4
Auto, personal 5
WC, Personal 6
Auto, WC, Personal 7
Solution 2: create lkp_lob, lob_group and lob_group_detail tables. Each combination of LOBs is represented by a group.
Lkp_lob
Lob_key Name
1 Auto
2 WC
3 Personal
Lob_group (unique constraint on lob_group_key and lob_key)
Lob_group_key Lob_key
1 1
2 2
3 3
4 1
4 2
5 1
5 3
6 2
6 3
7 1
7 2
7 3
Lob_group_detail
Lob_group_key Value
1 1
2 2
3 3
4 4
5 5
6 6
7 7
Any suggestion would be highly appreciated.
First of all, I did not understand the terms you used.
But from a database perspective, it is generally better to have separate tables for each module. You will face fewer difficulties when doing CRUD, and it will be faster.
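To make Solution 2 from the question concrete, here is a rough sketch of the three tables and one possible way to look up the value for a given set of LOBs. It uses Python's built-in sqlite3 module (the question does not name a database, so that is an assumption), and both the `value_for` helper and the GROUP BY/HAVING lookup are illustrative additions, not something the asker or the answer proposed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lkp_lob (lob_key INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE lob_group (
    lob_group_key INTEGER NOT NULL,
    lob_key       INTEGER NOT NULL REFERENCES lkp_lob(lob_key),
    UNIQUE (lob_group_key, lob_key)
);
CREATE TABLE lob_group_detail (lob_group_key INTEGER PRIMARY KEY, value INTEGER NOT NULL);
""")
conn.executemany("INSERT INTO lkp_lob VALUES (?, ?)",
                 [(1, "Auto"), (2, "WC"), (3, "Personal")])
conn.executemany("INSERT INTO lob_group VALUES (?, ?)",
                 [(1, 1), (2, 2), (3, 3), (4, 1), (4, 2), (5, 1), (5, 3),
                  (6, 2), (6, 3), (7, 1), (7, 2), (7, 3)])
conn.executemany("INSERT INTO lob_group_detail VALUES (?, ?)",
                 [(k, k) for k in range(1, 8)])

def value_for(lob_names):
    """Return the value stored for exactly this set of LOB names, or None."""
    names = sorted(set(lob_names))
    placeholders = ",".join("?" * len(names))
    # A group matches when it has as many members as the requested set
    # and every one of its members is in that set.
    sql = f"""
        SELECT d.value
        FROM lob_group AS g
        JOIN lkp_lob AS l ON l.lob_key = g.lob_key
        JOIN lob_group_detail AS d ON d.lob_group_key = g.lob_group_key
        GROUP BY g.lob_group_key, d.value
        HAVING COUNT(*) = ? AND SUM(l.name IN ({placeholders})) = ?
    """
    row = conn.execute(sql, [len(names), *names, len(names)]).fetchone()
    return row[0] if row else None

print(value_for(["Auto", "WC"]))
print(value_for(["WC", "Personal", "Auto"]))
```

Compared with Solution 1, the lookup does not depend on the order or spelling of a comma-separated string, at the cost of a slightly more involved query.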

SQL query to pull ONLY data which contains range of records

I need your help writing a SELECT query that pulls only the claim IDs whose codes all come from this range: 99213, 99214, 99215, 99217.
So my results should be claim ID 1 (all lines) and claim ID 3 (all lines). Since claim ID 2 has codes that are outside the range, I do not want it in my results.
Claim id line # code
1 1 99213
1 2 99214
1 3 99215
1 4 99217
2 1 99213
2 2 89557
2 3 36415
3 1 99215
3 2 99217
Result should be like this
Claim id line # code
1 1 99213
1 2 99214
1 3 99215
1 4 99217
3 1 99215
3 2 99217
Use a subquery to isolate the ClaimIDs that have Codes outside your list of values, then exclude them from the main query with NOT IN.
SELECT *
FROM Table
WHERE ClaimID NOT IN (
SELECT ClaimID FROM Table WHERE Code NOT IN (99213,99214,99215,99217)
);
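The query above can be checked against the sample data from the question. This sketch uses Python's built-in sqlite3 module; the table name `claim_lines` and the column names are hypothetical stand-ins for the real schema, which the question does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claim_lines (claim_id INTEGER, line_no INTEGER, code INTEGER)")
conn.executemany(
    "INSERT INTO claim_lines VALUES (?, ?, ?)",
    [(1, 1, 99213), (1, 2, 99214), (1, 3, 99215), (1, 4, 99217),
     (2, 1, 99213), (2, 2, 89557), (2, 3, 36415),
     (3, 1, 99215), (3, 2, 99217)],
)

# Inner query: claims that have at least one code outside the allowed list.
# Outer query: every line of every claim that is NOT one of those.
rows = conn.execute("""
    SELECT claim_id, line_no, code
    FROM claim_lines
    WHERE claim_id NOT IN (
        SELECT claim_id
        FROM claim_lines
        WHERE code NOT IN (99213, 99214, 99215, 99217)
    )
    ORDER BY claim_id, line_no
""").fetchall()

for row in rows:
    print(row)
```

Claim 2 is dropped entirely, including its 99213 line, because the filter works on whole claims rather than on individual lines.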

reorder sort_order in table with sqlite

I have this table:
id sort_ord
0 6
1 7
2 2
3 3
4 4
5 5
6 8
Why does this query:
UPDATE table
SET sort_ord=(
SELECT count(*)
FROM table AS pq
WHERE sort_ord<table.sort_ord
ORDER BY sort_ord
)
WHERE(sort_ord>=0)
Produce:
id sort_ord
0 4
1 5
2 0
3 1
4 2
5 4
6 6
I was expecting every sort_ord value to be reduced by 2.
This behavior is described here: https://www.sqlite.org/isolation.html
As I understand that page, the UPDATE and the SELECT count(*) subquery are not isolated from each other: the subquery runs again for every row being updated and sees the rows that the UPDATE has already changed.
By the time the UPDATE reaches id 5, the rows before it already hold their new sort_ord values, so the count comes out as 4.
Every SELECT is a fresh read of the current data. At that point the table looks like this (rows marked ** have not been updated yet):
id sort_ord
0 4
1 5
2 0
3 1
4 2
5 5**
6 8**
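One way to get the result the question expected (a sketch, not the only possible fix) is to compute every new rank from a snapshot before any row changes. The example uses Python's built-in sqlite3 module; the table name `items` and the temporary table `ranks` are hypothetical, since `table` itself is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, sort_ord INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(0, 6), (1, 7), (2, 2), (3, 3), (4, 4), (5, 5), (6, 8)],
)

# Step 1: take a snapshot of the new ranks while items is still unchanged.
conn.execute("""
    CREATE TEMP TABLE ranks AS
    SELECT a.id,
           (SELECT count(*) FROM items AS b WHERE b.sort_ord < a.sort_ord) AS new_ord
    FROM items AS a
""")

# Step 2: apply the snapshot. The subquery reads only ranks, so rows that
# were already updated cannot influence the rows that come after them.
conn.execute("""
    UPDATE items
    SET sort_ord = (SELECT new_ord FROM ranks WHERE ranks.id = items.id)
""")

result = conn.execute("SELECT id, sort_ord FROM items ORDER BY id").fetchall()
print(result)
```

Because the ranks are fixed before the UPDATE starts, id 5 ends up with 3 and every value is the original minus 2, as the question expected.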