SQL Server - Calculate node loops in a network chart

I have a table that contains all the "Edges" between "Nodes" with relation "from node to node".
Table Nodes looks like this:
CREATE TABLE Nodes(
Id VARCHAR(10) PRIMARY KEY,
Name VARCHAR(50)
)
CREATE TABLE Edges(
Id VARCHAR(100) PRIMARY KEY,
FromId VARCHAR(10),
ToId VARCHAR(10)
)
These tables describe a network chart like the one represented in the image.
I need a recursive query that returns all the paths that pass through the "FROM" nodes
and arrive at the "TO" nodes.
I also have to avoid infinite loops like the one highlighted in the yellow rectangle.
So basically the spider has to check the available paths for every node, track the nodes
it has already visited, and keep going until it reaches the "TO" nodes.
Then, as a second step, once I have all the routes, I need to test which routes
arrived at the destination by passing through the FROM path.
How can I do this with a query?
Thanks for your support.
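One way to picture the traversal described above is a recursive CTE that carries the visited nodes along in a delimited string and refuses to step onto a node that is already in it. The following is only a minimal sketch, run through Python's sqlite3 so it can be executed as-is; the sample edges, the '/' delimiter and the function name find_paths are assumptions made for illustration. In SQL Server the same shape would be written with WITH (no RECURSIVE keyword) and CHARINDEX instead of instr.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Edges (Id TEXT PRIMARY KEY, FromId TEXT, ToId TEXT);
-- Hypothetical sample graph: B -> C -> B forms a loop.
INSERT INTO Edges VALUES
  ('e1', 'A', 'B'), ('e2', 'B', 'C'), ('e3', 'C', 'B'),
  ('e4', 'C', 'D'), ('e5', 'A', 'C');
""")


def find_paths(conn, start, target):
    """Return every loop-free path from start to target as a list of node ids."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, path) AS (
            -- Anchor: the path starts at the start node.
            SELECT ?, '/' || ? || '/'
            UNION ALL
            -- Step: follow an edge unless its end node is already on the path.
            SELECT e.ToId, w.path || e.ToId || '/'
            FROM walk AS w
            JOIN Edges AS e ON e.FromId = w.node
            WHERE instr(w.path, '/' || e.ToId || '/') = 0
              AND w.node <> ?          -- stop extending once the target is reached
        )
        SELECT path FROM walk WHERE node = ? ORDER BY path
        """,
        (start, start, target, target),
    ).fetchall()
    return [p.strip("/").split("/") for (p,) in rows]
```

Calling find_paths(conn, 'A', 'D') on the sample graph returns only the routes that reach D without revisiting a node. Filtering those routes for the ones that pass through a particular node can then be done on the path column or on the returned lists.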

Related

Need to create a Run id for each package run in SSIS Package

I have an SSIS Package that runs a query and inserts values into a different table. Each time the package runs, I want to create a unique RunID for the results of that run. Here are the columns from my table. I have tried this using the Execute SQL Task and setting up the User::RunID variable but, I believe I am doing something wrong. Can anyone provide step by step instructions on how to do this?
You need 2 tables for this.
create table runs(
runID int identity primary key,
runDateTime datetime default getdate()
)
create table runReturns(
runReturnsID int identity primary key,
runID int not null,
[the rest of the data set]
)
In ssis, start with an execute SQL.
Add this query...
insert into runs (runDateTime) values(?);
select SCOPE_IDENTITY()
Map the parameter (?) to the current date and time, e.g. Now().
Change the result set to Single row and map the first column to a variable called runID.
Now create a data flow.
Insert your query into a sql source.
Add a derived column and map a new column to runID.
Finally, add a destination to your table and map accordingly.
Adding a pure SQL answer to complement the above as an alternative, since there are no transformations at all:
Same 2 tables:
create table runs(
runID int identity primary key,
runDateTime datetime default getdate()
)
create table runReturns(
runReturnsID int identity primary key,
runID int not null,
[the rest of the data set]
)
Create a Job.
Add a step and base it on SQL.
declare @runID int;
insert into runs(runDateTime) values(getdate());
select @runID = scope_identity();
insert into runReturns(
runID, [rest of your columns])
select @runID
, [rest of your columns]
from [rest of your query]
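To make the two-table pattern concrete, here is a small sketch that runs the same idea against an in-memory SQLite database from Python: insert one row into runs, read back the generated key, and stamp every result row with it. The name column and the record_run function are hypothetical stand-ins for "the rest of the data set", and cursor.lastrowid plays the role that SCOPE_IDENTITY() plays in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (
    runID INTEGER PRIMARY KEY AUTOINCREMENT,
    runDateTime TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE runReturns (
    runReturnsID INTEGER PRIMARY KEY AUTOINCREMENT,
    runID INTEGER NOT NULL REFERENCES runs(runID),
    name TEXT  -- hypothetical placeholder for the real result columns
);
""")


def record_run(conn, names):
    """Create one run row, then insert all result rows tagged with its id."""
    with conn:  # one transaction: the run and its rows are committed together
        cur = conn.execute("INSERT INTO runs DEFAULT VALUES")
        run_id = cur.lastrowid  # key generated for this run
        conn.executemany(
            "INSERT INTO runReturns (runID, name) VALUES (?, ?)",
            [(run_id, n) for n in names],
        )
    return run_id
```

Each call to record_run produces a new runID, so the rows of different package executions can be told apart afterwards.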
An approach that might solve the issue is the system-scoped variable ServerExecutionID. By default, system-scoped variables are hidden in the Variables menu, but you can expose them by clicking the Grid Options button (rightmost of the 5).
If you reference that variable using the appropriate placeholder (? for OLE/ODBC or a named parameter for ADO) and map it to the variable, then every server execution will have a monotonically increasing number associated with it. Runs from Visual Studio or outside of the SSISDB will always have a value of 0 associated with them, but given that this is only encountered during development, this might address the issue.
Sample query based on the newer picture
INSERT INTO dbo.RunMapTable
SELECT ? AS RunID
, D.Name
FROM
(
VALUES ('A')
, ('B')
, ('C')
, ('D')
) D([Name]);
Parameter Mapping
0 -> System::ServerExecutionID
As an added bonus, you can then tie your custom logging back to the native logging in the SSISDB.

How to insert data into a relation table correctly in SQL?

I have some data in a general table called ImportH. The data has been imported from a csv file. I have also created two tables, Media and Host (each one has its respective ID). These tables are related by a third table called HostMedia.
Each Host can have (or not) different types of Media (facebook, email, phone...).
I'll provide some images of the tables:
Table ImportH
Table Host
Table Media
How can I insert the data from the other tables into table HostMedia? This table looks like this:
create table HostMedia (
host_id int references Host (host_id),
id_media int references Media (id_verification),
primary key (host_id, id_media)
);
I have tried this:
insert into HostMedia (host_id, id_media)
select Host.host_id, Media.id_verification
from Host, Media;
But this does the cartesian product for all the hosts assigning them all the rows on the Media table. What's the correct way?
The "media" column in your "ImportH" table looks almost like a valid JSON, so this might work:
INSERT INTO HostMedia (host_id, id_media)
SELECT i.host_id, m.id_verification
FROM (
SELECT host_id,
json_array_elements_text(replace(media,'''','"')::json) AS media_name
FROM ImportH
) AS i
JOIN Media AS m ON m.media = i.media_name;
Notes: it would be easier if you
provided text data instead of screenshots
used logical column names
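The same idea can be sketched outside PostgreSQL: turn each host's list-like media string into individual names, then keep only the (host, media) pairs whose name exists in Media, which is what the JOIN does. The sample rows, the exact string format "['email', 'phone']" and the function name link_hosts_to_media are assumptions, since the real data was only shown in screenshots; the Host table is left out to keep the example short.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Media (id_verification INTEGER PRIMARY KEY, media TEXT UNIQUE);
CREATE TABLE ImportH (host_id INTEGER PRIMARY KEY, media TEXT);
CREATE TABLE HostMedia (
    host_id INTEGER,
    id_media INTEGER REFERENCES Media(id_verification),
    PRIMARY KEY (host_id, id_media)
);
""")
# Hypothetical sample data in the assumed "almost JSON" format.
conn.executemany("INSERT INTO Media VALUES (?, ?)",
                 [(1, "email"), (2, "phone"), (3, "facebook")])
conn.executemany("INSERT INTO ImportH VALUES (?, ?)",
                 [(10, "['email', 'phone']"), (11, "[]"), (12, "['facebook']")])


def link_hosts_to_media(conn):
    """Fill HostMedia with one row per media name listed for each host."""
    media_ids = dict(conn.execute("SELECT media, id_verification FROM Media"))
    pairs = []
    rows = conn.execute("SELECT host_id, media FROM ImportH ORDER BY host_id").fetchall()
    for host_id, raw in rows:
        # Swap single quotes for double quotes so the text parses as JSON.
        for name in json.loads(raw.replace("'", '"')):
            if name in media_ids:  # same effect as the inner join on Media
                pairs.append((host_id, media_ids[name]))
    with conn:
        conn.executemany("INSERT OR IGNORE INTO HostMedia VALUES (?, ?)", pairs)
    return pairs
```

A host with an empty list simply produces no rows, so it gets no entry in HostMedia instead of the full cartesian product. Note that the quote replacement would break on media names that themselves contain an apostrophe.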

Database design for a template based evaluation system

We are working on a database to store some evaluations we conduct. There are a few different types of evaluations and some have changed over time. Because of this we need to keep a record of exactly what an evaluation looked like when it was undertaken.
I figured that the best way to support this would be through a template style system.
With:
A table saving all possible options;
A table mapping options to a template;
An evaluations table mapping a participant to a template on a date/time; and
A table mapping evaluator comments to an option of an evaluation.
This is a skeleton for the design:
CREATE TABLE options (
id SERIAL PRIMARY KEY,
option TEXT NOT NULL
);
CREATE TABLE templates (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE template_options (
template INTEGER NOT NULL REFERENCES templates( id ),
option INTEGER NOT NULL REFERENCES options( id ),
UNIQUE ( template, option )
);
CREATE TABLE participants (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE evaluations (
id SERIAL PRIMARY KEY,
template INTEGER NOT NULL REFERENCES templates( id ),
participant INTEGER NOT NULL REFERENCES participants( id ),
date TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE TABLE evaluation_data (
evaluation INTEGER NOT NULL REFERENCES evaluations( id ),
option INTEGER NOT NULL REFERENCES options( id ),
evaluator_comments TEXT NOT NULL
);
The design is able to capture our data but doesn't restrict the options saved in evaluation_data to the subset specified in the evaluation's template's option mapping. We could probably enforce it with a trigger (we can definitely do it with application logic [we are doing so at the moment]) but are we going down the wrong path with this design?
Can anybody think of a better way to do it?
Edit:
Added an example of a potential trigger we would need to use to ensure valid options are enforced with this design.
CREATE FUNCTION valid_option() RETURNS trigger as $valid_option$
BEGIN
IF NOT NEW.option IN ( SELECT template_options.option
FROM template_options
INNER JOIN templates
ON template_options.template = templates.id
WHERE templates.id = ( SELECT evaluations.template
FROM evaluations
WHERE evaluations.id = NEW.evaluation ) ) THEN
RAISE EXCEPTION 'This option is not mapped for this evaluations template.';
END IF;
RETURN NEW;
END
$valid_option$ LANGUAGE plpgsql;
CREATE TRIGGER valid_option BEFORE INSERT ON evaluation_data FOR EACH ROW EXECUTE PROCEDURE valid_option();
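For anyone who wants to see the trigger's check behave end to end, here is a cut-down sketch using SQLite from Python. It assumes evaluation_data carries an evaluation column (which is what the trigger above reads via NEW.evaluation), leaves out the options, templates and participants tables, and uses SQLite's RAISE(ABORT, ...) in place of the plpgsql RAISE EXCEPTION. The try_insert helper and the sample ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE template_options (
    template INTEGER NOT NULL,
    option INTEGER NOT NULL,
    UNIQUE (template, option)
);
CREATE TABLE evaluations (id INTEGER PRIMARY KEY, template INTEGER NOT NULL);
CREATE TABLE evaluation_data (
    evaluation INTEGER NOT NULL REFERENCES evaluations(id),  -- assumed column
    option INTEGER NOT NULL,
    evaluator_comments TEXT NOT NULL
);
-- Reject rows whose option is not mapped to the evaluation's template.
CREATE TRIGGER valid_option BEFORE INSERT ON evaluation_data
BEGIN
    SELECT RAISE(ABORT, 'option is not mapped for this evaluation''s template')
    WHERE NEW.option NOT IN (
        SELECT t.option
        FROM template_options AS t
        JOIN evaluations AS e ON e.template = t.template
        WHERE e.id = NEW.evaluation
    );
END;
INSERT INTO template_options VALUES (1, 100), (1, 101), (2, 200);
INSERT INTO evaluations VALUES (1, 1);
""")


def try_insert(conn, evaluation, option, comment):
    """Return True if the row was accepted, False if the trigger rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO evaluation_data VALUES (?, ?, ?)",
                         (evaluation, option, comment))
        return True
    except sqlite3.DatabaseError:
        return False
```

With the sample data, option 100 is accepted for evaluation 1 while option 200, which belongs to a different template, is rejected.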
Remember that you need two sets of tables. The first set contains the assessment, questions, answer alternatives and categories(?) needed to display the assessment to the participant. The second set of tables records data about the evaluation (i.e. the participant taking the assessment): which assessment, which questions, which answer alternatives and in which order they were presented, which answer they entered (are they allowed to answer the same question multiple times?), etc.
We're using the following structure (I've removed topic scoring since you didn't ask about it):
Models for presenting an assessment:
Assessment: assessment_name, passing_status, version
Question: assessment, question_number, question_type, question_text
AnswerAlternative: question, correct?, answer_text, points
Models for recording an evaluation (participant taking an assessment):
Progress: started_timestamp, finished_timestamp, last_activity, status (includes "finished")
Result: user, assessment, progress, currently_active, score, passing_grade?
Answer: result, question, selected_answer_alternative, answer_text, score
To achieve your goal, I would augment this by writing the generated evaluation to a table and pointing to it from Result. You could also record the selection and presentation criteria so you can re-generate the assessment programmatically (i.e. if you're selecting the questions from a larger question db and re-ordering the answer alternatives before presenting them to the participant).

Export / import tree (id's conflicts)

Let's assume we have a table in database with the following structure:
id (int32), parentId (int32), nodeName, nodeBodyText, ...
Of course some kind of "tree" is stored there.
A user exports some branch of the tree to a csv/xml/etc. file.
When this file is imported into another db (with different nodes, of course), id conflicts are likely:
1) Records with the same id's may exist already
2) Db has the id column with the auto-incrementing enabled
(so you can't explicitly specify id for newly created record)
How is this problem usually solved?
Especially when nodeBodyText may also contain text that refers to other nodes
(using hardcoded ids from the previous db)
P.S.
Using GUIDs is not acceptable for us.
Assuming that the imported subtree has parent references confined to that subtree only, and that you are only inserting nodes, not updating them, you can do this in SQL Server.
First, you need a mapping table to store the old and new ids:
declare @idmap table
(
old_id int, new_id int
)
Then insert the imported nodes using the MERGE command:
MERGE [target] as t
USING [source] as s ON 1=0 -- never match, so every source node is inserted as new
WHEN NOT MATCHED
THEN INSERT(parentid,nodename) VALUES(s.parentid,s.nodename)
OUTPUT s.id, inserted.id INTO @idmap; -- store old and new id in the mapping table
Finally, re-map the target table's parent ids:
update t
set parentid = x.new_id
from [target] t
inner join @idmap x on x.old_id = t.parentid
where t.parentid is not null
and -- only the newly inserted nodes
exists(select * from @idmap where new_id = t.id);
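The old-id to new-id mapping is the core of this, and it is not specific to MERGE. As a rough illustration, the sketch below does the same thing in two passes against SQLite from Python: insert every imported node so the database assigns fresh ids, remember the mapping, then rewrite the parent references through it. The nodes table layout, the sample rows and the function name import_subtree are assumptions; it also differs from the MERGE version in that parents are filled in during the second pass rather than inserted and then updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    parentId INTEGER,
    nodeName TEXT
);
-- Rows already present in the target db; their ids (1, 2) will collide
-- with the ids used in the exported file.
INSERT INTO nodes (parentId, nodeName)
VALUES (NULL, 'existing root'), (1, 'existing child');
""")


def import_subtree(conn, exported):
    """Import (old_id, old_parent_id, name) rows and return {old_id: new_id}."""
    id_map = {}
    with conn:
        # Pass 1: insert each node with no parent and record the id it received.
        for old_id, _old_parent, name in exported:
            cur = conn.execute(
                "INSERT INTO nodes (parentId, nodeName) VALUES (NULL, ?)", (name,))
            id_map[old_id] = cur.lastrowid
        # Pass 2: translate parent references; parents outside the subtree stay NULL.
        for old_id, old_parent, _name in exported:
            if old_parent in id_map:
                conn.execute("UPDATE nodes SET parentId = ? WHERE id = ?",
                             (id_map[old_parent], id_map[old_id]))
    return id_map
```

The returned mapping is also what you would need for the nodeBodyText case: any hardcoded id found in the text can be looked up in it and replaced, though how those references are encoded in the text is something only you know.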

Enumerated text columns in SQL

I have a number of tables that have text columns that contain only a few different distinct values. I often play the tradeoff between the benefits (primarily reduced row size) of extracting the possible values into a lookup table and storing a small index in the table against the amount of work required to do so.
For the columns that have a fixed set of values known in advance (enumerated values), this isn't so bad, but the more painful case is when I know I have a small set of unique values, but I don't know in advance what they will be.
For example, if I have a table that stores log information on different URLs in a web application:
CREATE TABLE [LogData]
(
ResourcePath varchar(1024) NOT NULL,
EventTime datetime NOT NULL,
ExtraData varchar(MAX) NOT NULL
)
I waste a lot of space by repeating the ResourcePath for every request. There will be a very large number of duplicate entries in this table. I usually end up with something like this:
CREATE TABLE [LogData]
(
ResourcePathId smallint NOT NULL,
EventTime datetime NOT NULL,
ExtraData varchar(MAX) NOT NULL
)
CREATE TABLE [ResourcePaths]
(
ResourcePathId smallint NOT NULL,
ResourceName varchar(1024) NOT NULL
)
In this case, however, I no longer have a simple way to append data to the LogData table. I have to do a lookup on the resource paths table to get the Id, add it if it is missing, and only then can I perform the actual insert. This makes the code much more complicated and changes my write-only logging function to require some sort of transaction against the lookup table.
Am I missing something obvious?
If you have a unique index on ResourceName, the lookup should be very fast even on a big table. However, it has disadvantages. For instance, if you log a lot of data and have to archive it off periodically and want to archive the previous month or year of LogData, you are forced to keep all of ResourcePaths. You can come up with solutions for all of that.
Yes: insert from the existing data, doing the lookup as part of the insert.
Given @resource, @time and @data as inputs:
insert into LogData (ResourcePathId, EventTime, ExtraData)
select ResourcePathId, @time, @data
from ResourcePaths
where ResourceName = @resource
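The insert-select above only works when the path is already in ResourcePaths. One possible way to cover the "add it if it is missing" step is to run an insert that ignores duplicates first, relying on a unique constraint on ResourceName. This is only a sketch using SQLite from Python; INSERT OR IGNORE is SQLite syntax, the log_event function is hypothetical, and on SQL Server you would need an equivalent such as an INSERT guarded by NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ResourcePaths (
    ResourcePathId INTEGER PRIMARY KEY,
    ResourceName TEXT NOT NULL UNIQUE   -- the unique index the lookup relies on
);
CREATE TABLE LogData (
    ResourcePathId INTEGER NOT NULL REFERENCES ResourcePaths(ResourcePathId),
    EventTime TEXT NOT NULL,
    ExtraData TEXT NOT NULL
);
""")


def log_event(conn, resource, event_time, data):
    """Append one log row, creating the lookup row for the path if needed."""
    with conn:  # both statements run in a single transaction
        # No-op when the path is already known, thanks to the UNIQUE constraint.
        conn.execute(
            "INSERT OR IGNORE INTO ResourcePaths (ResourceName) VALUES (?)",
            (resource,))
        # The lookup happens as part of the insert, as in the answer above.
        conn.execute(
            """
            INSERT INTO LogData (ResourcePathId, EventTime, ExtraData)
            SELECT ResourcePathId, ?, ?
            FROM ResourcePaths
            WHERE ResourceName = ?
            """,
            (event_time, data, resource))
```

The caller still makes a single call per log entry, so the logging code stays close to the original write-only shape even though two statements run underneath.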