How to gather all file names in a folder and subdirectories into a SQL Server table

I would like to know how to do this.
For example:
I have c:\temp\.
Inside this temp folder I have various files and folders in varying structures.
What would be the easiest way to gather all file names inside temp and its subdirectories and then insert them into a table?
I am planning for the table structure to be simple.
It will have:
Primary key
Path and filename
CreatedDate
ModifiedDate
DeleteDate
So the table would look something like this:
Key | PathFilename | Created | Modified | Delete |
1 | c:\temp\fil7.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
2 | c:\temp\fi5e.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
3 | c:\temp\1ile.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
4 | c:\temp\2ile.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
5 | c:\temp\3ile.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
6 | c:\temp\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
7 | c:\temp\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
8 | c:\temp\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
9 | c:\temp\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
10 | c:\temp\folde1\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
11 | c:\temp\folde2\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
12 | c:\temp\folde4\file.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
13 | c:\temp\folder\fil5.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
14 | c:\temp\folder\fil6.txt | 2013/02/01 | 2013/02/01 | 1900/01/01|
Can I do this with an SSIS job? Or is there another solution that can accomplish this task?
Is there any tutorial on how to do this step by step?
Thank you
P.S. I have a FileSystemWatcher VB.NET program that will watch for created and modified files,
but for the initial load I would like to fill the table with the files that already exist. I don't know whether FileSystemWatcher can handle this initial task. Can it?

I would create a variable, FolderSource, of type String and assign it your value of c:\temp.
While you can do all of this in a single Script Task which is an object on the Control Flow, I am going to describe how to do it with the Data Flow Task as that might be a better construct for learning how SSIS generally works. Drag a Data Flow Task onto the canvas. Double click on it.
Inside your Data Flow Task, add a Script Component (as a Source). Add a reference to the variable FolderSource as ReadOnly. In the Inputs and Outputs pane, rename the output buffer to FS and add the columns below. The data types are a four-byte integer (DT_I4), a string of length 255 (DT_STR), and dates (DT_DATE).
public override void CreateNewOutputRows()
{
    string src = Variables.FolderSource;
    int key = 1;

    // Walk every file under the source folder, including all subdirectories.
    foreach (string currentFile in System.IO.Directory.EnumerateFiles(src, "*.*", System.IO.SearchOption.AllDirectories))
    {
        System.IO.FileInfo fileInfo = new System.IO.FileInfo(currentFile);

        // Emit one row per file into the FS output buffer.
        FSBuffer.AddRow();
        FSBuffer.Key = key++;
        FSBuffer.PathFilename = currentFile;

        // UTC-flavored equivalents (CreationTimeUtc, LastWriteTimeUtc) exist too.
        FSBuffer.Created = fileInfo.CreationTime;
        FSBuffer.Modified = fileInfo.LastWriteTime;

        // 1900-01-01 is the sentinel for "not deleted".
        FSBuffer.Delete = new DateTime(1900, 1, 1);
    }
}
That'll get the data streaming down your data flow. If you need to do anything with the data, you would add various components now.
Once you've manipulated the rows of data, you'll need to land them somewhere. There are a host of destinations available, but you'll probably want the OLE DB Destination component. Connect the output of the Script Component, or any subsequent component(s) you used, to the destination. Double click on it to specify the database connection, the table name, and the mapping of columns, in that order.
You probably don't have an OLE DB Connection Manager defined so click the Connection Manager button in the destination and create a new one. After creating a Connection Manager, you'll select the table where the data should reside. Then on the Columns tab, map the source columns (from the Script Component) to the destination (table).
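If the destination table doesn't exist yet, a minimal sketch of its DDL might look like the following (the table name FileInventory is my assumption; adjust names and types to match your output buffer):
CREATE TABLE dbo.FileInventory
(
    [Key] int NOT NULL PRIMARY KEY,   -- populated by the script's counter
    PathFilename varchar(255) NOT NULL,
    Created datetime NOT NULL,
    Modified datetime NOT NULL,
    [Delete] datetime NOT NULL        -- 1900-01-01 acts as the "not deleted" sentinel
);
Key and Delete are bracketed because they collide with T-SQL keywords; you could also make Key an IDENTITY column and drop the counter from the script.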

Related

SQL to set a value based on a value from a different table automatically

The title may not be that helpful but what I am trying to do is this.
For simplicity's sake, I have two tables: one called LOGS and another called LOG CONTROLS.
In LOGS I have a log event column, which is automatically populated by imported information. In LOG CONTROLS I have a manually entered list of log events (to match the ones coming in); this table assigns them ID numbers and holds other details about each event.
What I need to do is have a column in the LOGS table which looks at the log events, matches each one to the ID from the LOG CONTROLS table, and assigns that ID in the LOGS table.
I have seen a few methods of changing information in columns based on information in other tables, but all of these seem to be one-way checks, i.e. if ID = X, change to VALUE FROM OTHER TABLE, whereas what I need is: if VALUE = X in the other table, change the ID field to = Y from the other table.
Below is a mock up of the tables.
+----+-----------+----------+------------+
| ID | Date_Time | Event    | Control ID |
+----+-----------+----------+------------+
| 1  | 0/0/0     | Shutdown |            |
| 2  | 0/0/0     | Start up |            |
| 3  | 0/0/0     | Error    |            |
| 4  | 0/0/0     | Info     |            |
| 5  | 0/0/0     | Shutdown |            |
| 6  | 0/0/0     | Error    |            |
+----+-----------+----------+------------+
+------------+----------+--------+-------+
| Control ID | Event    | Export | Flag  |
+------------+----------+--------+-------+
| 1          | Shutdown | TRUE   | TRUE  |
| 2          | Start up | TRUE   | FALSE |
| 3          | Error    | TRUE   | TRUE  |
| 4          | Info     | TRUE   | FALSE |
+------------+----------+--------+-------+
So I need the Control ID in the first table to match the control ID from the second table depending on what the event was.
I hope this makes sense.
Any help or advice would be greatly appreciated.
From your description, it seems that a simple UPDATE statement is all you need:
update logs
set control_id = c.control_id
from log_controls as c
where c.event = logs.event;
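If you'd like to sanity-check the matches before modifying data, a quick preview along these lines may help (table and column names assumed from the mockup above):
select logs.id, logs.event, c.control_id
from logs
inner join log_controls as c
    on c.event = logs.event;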

Problems with using the bootstrap-datepicker in Fitnesse Tests

In my FitNesse tests I want to enter dates through datepicker elements. Sometimes it works, but most of the time a date different from the one entered appears. Here is an example:
| ensure | do | type | on | id=field_id | with | |
| ensure | do | type | on | id=field_id | with | 05.05.1997 |
| check | is | verifyValue | on | id=field_id | [28.05.1997] expected [05.05.1997] |
(To make sure that the field isn't already filled, I pass an empty String first.)
Usually the day part is different from what was entered. Do you know the reason for this behavior? How can I solve it?
Thanks in advance!
This is related to how you wrote your fixture rather than to FitNesse itself. The problem is that the fixture returns a different value, which also implies that the previous line didn't work: | ensure | do | type | on | id=field_id | with | 05.05.1997 |

Last accessed timestamp of a Netezza table?

Does anyone know of a query that gives me details on the last time a Netezza table was accessed for any of the operations (select, insert, or update)?
Depending on your setup you may want to try the following query:
select *
from _v_qryhist
where lower(qh_sql) like '%tablename %'
There is a collection of history views in Netezza that should provide the information you require.
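To surface the most recent access first, you could order on the query start time. A sketch (the qh_user and qh_tstart column names are my assumption about the _v_qryhist view; verify them against your system):
select qh_user, qh_tstart, qh_sql
from _v_qryhist
where lower(qh_sql) like '%tablename %'
order by qh_tstart desc;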
Netezza does not track this information in the catalog, so you will typically have to mine that from the query history database, if one is configured.
Modern Netezza query history information is typically stored in a dedicated database. Depending on permissions, you may be able to see if history collection is enabled, and which database it is using with the following command. Apologies in advance for the screen-breaking wrap to come.
SYSTEM.ADMIN(ADMIN)=> show history configuration;
CONFIG_NAME | CONFIG_DBNAME | CONFIG_DBTYPE | CONFIG_TARGETTYPE | CONFIG_LEVEL | CONFIG_HOSTNAME | CONFIG_USER | CONFIG_PASSWORD | CONFIG_LOADINTERVAL | CONFIG_LOADMINTHRESHOLD | CONFIG_LOADMAXTHRESHOLD | CONFIG_DISKFULLTHRESHOLD | CONFIG_STORAGELIMIT | CONFIG_LOADRETRY | CONFIG_ENABLEHIST | CONFIG_ENABLESYSTEM | CONFIG_NEXT | CONFIG_CURRENT | CONFIG_VERSION | CONFIG_COLLECTFILTER | CONFIG_KEYSTORE_ID | CONFIG_KEY_ID | KEYSTORE_NAME | KEY_ALIAS | CONFIG_SCHEMANAME | CONFIG_NAME_DELIMITED | CONFIG_DBNAME_DELIMITED | CONFIG_USER_DELIMITED | CONFIG_SCHEMANAME_DELIMITED
-------------+---------------+---------------+-------------------+--------------+-----------------+-------------+---------------------------------------+---------------------+-------------------------+-------------------------+--------------------------+---------------------+------------------+-------------------+---------------------+-------------+----------------+----------------+----------------------+--------------------+---------------+---------------+-----------+-------------------+-----------------------+-------------------------+-----------------------+-----------------------------
ALL_HIST_V3 | NEWHISTDB | 1 | 1 | 20 | localhost | HISTUSER | aFkqABhjApzE$flT/vZ7hU0vAflmU2MmPNQ== | 5 | 4 | 20 | 0 | 250 | 1 | f | f | f | t | 3 | 1 | 0 | 0 | | | HISTUSER | f | f | f | f
(1 row)
Also make note of the CONFIG_VERSION, as it will come into play when crafting the following query example. In my case, I happen to be using the version 3 format of the query history database.
Assuming history collection is configured, and that you have access to the history database, you can get the information you're looking for from the tables and views in that database, which are documented by IBM. The following is an example, which reports when the given table was the target of a successful insert, update, or delete by referencing the "usage" column. Here I use one of the history table helper functions to unpack that column.
SELECT FORMAT_TABLE_ACCESS(usage),
       hq.submittime
FROM "$v_hist_queries" hq
INNER JOIN "$hist_table_access_3" hta
    USING (NPSID, NPSINSTANCEID, OPID, SESSIONID)
WHERE hq.dbname = 'PROD'
  AND hta.schemaname = 'ADMIN'
  AND hta.tablename = 'TEST_1'
  AND hq.SUBMITTIME > '01-01-2015'
  AND hq.SUBMITTIME <= '08-06-2015'
  AND (
        instr(FORMAT_TABLE_ACCESS(usage), 'ins') > 0
        OR instr(FORMAT_TABLE_ACCESS(usage), 'upd') > 0
        OR instr(FORMAT_TABLE_ACCESS(usage), 'del') > 0
      )
  AND status = 0;
FORMAT_TABLE_ACCESS | SUBMITTIME
---------------------+----------------------------
ins | 2015-06-16 18:32:25.728042
ins | 2015-06-16 17:46:14.337105
ins | 2015-06-16 17:47:14.430995
(3 rows)
You will need to change the digit at the end of $hist_table_access_3 to match your query history version.

SQL - Combining 3 rows per group in a logging scenario

I have reworked our API's logging system to use Azure Table Storage instead of SQL storage, for cost and performance reasons. I am now migrating our legacy logs to the new system. I am building a SQL query per table that will map the old fields to the new ones, with the intention of exporting to CSV and then importing into Azure.
So far, so good. However, one artifact of the previous system is that it logged three times per request (Call Begin, Call Response, and Call End), while the new one logs the call as just one entry (again, for cost and performance reasons).
Some fields are common to all three related logs, e.g. the Session, which uniquely identifies the call.
For some fields I only want the first log's value, e.g. Date, which may be a few seconds later in the second and third logs.
Some fields are shared for the three different purposes, e.g. Parameters gives the Input Model for Call Begin, Output Model for Call Response, and HTTP response (e.g. OK) for Call End.
Some fields are unused for two of the purposes, e.g. ExecutionTime is -1 for Call Begin and Call Response, and a value in ms for Call End.
How can I "roll up" each set of 3 rows into one row per set? I have tried using DISTINCT and GROUP BY, but the fact that some of the information collides is making it very difficult. My SQL isn't really good enough to explain what I'm asking for, so perhaps an example will make it clearer:
Example of what I have:
SQL:
SELECT * FROM [dbo].[Log]
Results:
+---------+---------------------+-------+------------+---------------+---------------+-----------------+
| Session | Date                | Level | Context    | Message       | ExecutionTime | Parameters      |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+
| 84248B7 | 2014-07-20 19:16:15 | INFO  | GET v1/abc | Call Begin    | -1            | {"Input":"xx"}  |
| 84248B7 | 2014-07-20 19:16:15 | INFO  | GET v1/abc | Call Response | -1            | {"Output":"yy"} |
| 84248B7 | 2014-07-20 19:16:15 | INFO  | GET v1/abc | Call End      | 123           | OK              |
| F76BCBB | 2014-07-20 19:16:17 | ERROR | GET v1/def | Call Begin    | -1            | {"Input":"ww"}  |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call Response | -1            | {"Output":"vv"} |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call End      | 456           | BadRequest      |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+
Example of what I want:
SQL:
[Need to write this query]
Results:
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| Date | Level | Context | Message | ExecutionTime | InputModel | OutputModel | HttpResponse |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| 2014-07-20 19:16:15 | INFO | GET v1/abc | Api Call | 123 | {"Input":"xx"} | {"Output":"yy"} | OK |
| 2014-07-20 19:16:17 | ERROR | GET v1/def | Api Call | 456 | {"Input":"ww"} | {"Output":"vv"} | BadRequest |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
select L1.Session, L1.Date, L1.Level, L1.Context, 'Api Call' AS Message,
L3.ExecutionTime,
L1.Parameters as InputModel,
L2.Parameters as OutputModel,
L3.Parameters as HttpResponse
from Log L1
inner join Log L2 ON L1.Session = L2.Session
inner join Log L3 ON L1.Session = L3.Session
where L1.Message = 'Call Begin'
and L2.Message = 'Call Response'
and L3.Message = 'Call End'
This would work with your sample data.
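If you'd rather avoid the triple self-join, conditional aggregation is an equivalent approach. A sketch, assuming each Session has exactly one row per Message value:
select Session,
       min(Date) as Date,
       min(Level) as Level,
       min(Context) as Context,
       'Api Call' as Message,
       max(ExecutionTime) as ExecutionTime,
       max(case when Message = 'Call Begin' then Parameters end) as InputModel,
       max(case when Message = 'Call Response' then Parameters end) as OutputModel,
       max(case when Message = 'Call End' then Parameters end) as HttpResponse
from Log
group by Session;
The MIN on Date picks the earliest timestamp (the Call Begin row), and the MAX on ExecutionTime picks the real value over the -1 placeholders.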

Unique string table in SQL and replacing index values with string values during query

I'm working on an old SQL Server database that has several tables that look like the following:
|-------------|-----------|-------|------------|------------|-----|
| MachineName | AlarmName | Event | AlarmValue | SampleTime | ... |
|-------------|-----------|-------|------------|------------|-----|
| 3           | 180       | 8     | 6.780      | 2014-02-24 |     |
| 9           | 67        | 8     | 1.45       | 2014-02-25 |     |
| ...         |           |       |            |            |     |
|-------------|-----------|-------|------------|------------|-----|
There is a separate table in the database that only contains unique strings, as well as the index for each unique string. The unique string table looks like this:
|----------|--------------------------------|
| Id       | String                         |
|----------|--------------------------------|
| 3        | MyMachine                      |
| ...      |                                |
| 8        | High CPU Usage                 |
| ...      |                                |
| 67       | 404 Error                      |
| ...      |                                |
|----------|--------------------------------|
Thus, when we want to get something out of the database, we pull the relevant rows out, then look up each missing string based on the index value.
What I'm hoping to do is replace all of the string indexes with the actual values in a single query, without having to do post-processing on the query result.
However, I can't figure out how to do this in a single query. Do I need to use multiple JOINs? I've only been able to figure out how to replace a single value by doing something like:
SELECT UniqueString.String AS "MachineName" FROM UniqueString
JOIN Alarm ON Alarm.MachineName = UniqueString.Id
Any help would be much appreciated!
Yes, you can do multiple joins to the UniqueString table, but change the order to start with the table you are reporting on, and use a unique alias for each join. Something like:
SELECT MN.String AS 'MachineName', AN.String as 'AlarmName' FROM Alarm A
JOIN UniqueString MN ON A.MachineName = MN.Id
JOIN UniqueString AN ON A.AlarmName = AN.Id
etc. for any other columns.
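For instance, extending the same pattern to the Event column, and using LEFT JOINs in case an id has no matching string (column names taken from the question's tables):
SELECT MN.String AS 'MachineName',
       AN.String AS 'AlarmName',
       EV.String AS 'Event',
       A.AlarmValue,
       A.SampleTime
FROM Alarm A
LEFT JOIN UniqueString MN ON A.MachineName = MN.Id
LEFT JOIN UniqueString AN ON A.AlarmName = AN.Id
LEFT JOIN UniqueString EV ON A.Event = EV.Id
With LEFT JOINs, a row whose id has no entry in UniqueString still comes back, with NULL in the corresponding string column.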