Invalid column number in format file (SQL)

I looked at a similar question (Bulk inserting a csv in SQL using a formatfile to remove double quotes), but my situation is just different enough.
First, how would I upload this very lengthy format file? I looked at GitHub, but it is not clear enough to me.
This is the error I get:
bulk insert equi2022a
From 'C:\Users\someone\Desktop\equi.txt'
WITH (FORMATFILE = 'C:\Users\someone\Desktop\formatfileequi-2.txt'
);
Msg 4823, Level 16, State 1, Line 1
Cannot bulk load. Invalid column number in the format file
"C:\Users\someone\Desktop\formatfileequi-2.txt".
I created the format file manually; a small snippet of it is below. I painstakingly went through every row to make sure the numbering is in perfect order (1, 2, 3, 4, ... up to 122) in both of the columns designated for it.
11.0
122
1 SQLCHAR 0 01 "" 1 transcode ""
2 SQLCHAR 0 02 "" 2 stfips ""
3 SQLCHAR 0 04 "" 3 year ""
4 SQLCHAR 0 01 "" 4 qtr ""
5 SQLCHAR 0 10 "" 5 uiacct ""
6 SQLCHAR 0 05 "" 6 run ""
7 SQLCHAR 0 09 "" 7 ein ""
8 SQLCHAR 0 10 "" 8 presesaid ""
9 SQLCHAR 0 05 "" 9 predrun ""
10 SQLCHAR 0 10 "" 10 succuiacct ""
11 SQLCHAR 0 05 "" 11 succrun ""
12 SQLCHAR 0 35 "" 12 legalname ""
and then the ending:
115 SQLCHAR 0 10 "" 115 wrlargestcontribsucc""
116 SQLCHAR 0 06 "" 116 wrcountlargestcontrib""
117 SQLCHAR 0 06 "" 117 wrhires ""
118 SQLCHAR 0 06 "" 118 wrseparations""
119 SQLCHAR 0 06 "" 119 wrnewentrants""
120 SQLCHAR 0 60 "" 120 wrexits ""
121 SQLCHAR 0 06 "" 121 wrcontrecords""
122 SQLCHAR 0 78 "" 122 Blank7 ""
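One thing that may be worth ruling out is a structural slip that a visual check of the numbering would not catch. A non-XML format file has a version line, a field-count line, and then one row per field with eight whitespace-separated values (field order, data type, prefix length, data length, terminator, column order, column name, collation). The Python sketch below only counts those values and checks the numbering; the function name is hypothetical, and it assumes no terminator contains a space. Several rows in the snippet above show the column name fused to the trailing "" (for example wrseparations""), though whether that is in the real file or just an artifact of pasting is not known.

```python
def check_format_file(text):
    """Return a list of structural problems found in a non-XML format file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    problems = []
    declared = int(lines[1])  # second line: declared number of fields
    rows = lines[2:]
    if len(rows) != declared:
        problems.append(f"declared {declared} fields but found {len(rows)} rows")
    for position, row in enumerate(rows, start=1):
        parts = row.split()
        if len(parts) != 8:
            problems.append(
                f"row {position}: expected 8 fields, got {len(parts)}: {row!r}"
            )
            continue
        if parts[0] != str(position):
            problems.append(f"row {position}: field order is {parts[0]}")
    return problems


sample = (
    "11.0\n"
    "3\n"
    '1 SQLCHAR 0 01 "" 1 transcode ""\n'
    '2 SQLCHAR 0 02 "" 2 stfips ""\n'
    '3 SQLCHAR 0 06 "" 3 wrseparations""\n'
)
problems = check_format_file(sample)
for problem in problems:
    print(problem)
```

Running it against the full 122-row file would list any row that does not split into eight values, which is quicker than re-reading the file by eye.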

How to sum values in dataframes based on index match

I have about 16 dataframes representing users' weekly clickstream data. The samples below are taken from weeks 0-3. I want to build a new dataframe for each week that holds the running total: for example, for w=2 the new dataframe is w2 = w0 + w1 + w2, and for w=3 it is w3 = w0 + w1 + w2 + w3. As you can see, the dataframes do not have identical id_user values, because a user may not show up in a certain week. All dataframes have the same columns, but the indexes are not exactly the same. How can I add them so that rows are summed where the indexes match?
id_user c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
43284 1 8 0 8 5 0 0 0 2 3 1
45664 0 16 0 4 0 0 0 0 5 16 2
52014 0 0 0 5 4 0 0 0 0 2 2
53488 1 37 0 19 0 0 3 0 3 23 6
60135 0 124 0 87 3 0 24 0 8 19 14
id_user c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
40419 0 8 0 3 4 0 6 0 1 6 0
43284 1 4 0 14 26 2 0 0 2 4 2
45664 0 9 0 15 11 0 0 0 1 6 14
52014 0 0 0 8 9 0 8 0 2 2 1
53488 0 2 0 4 0 0 4 0 0 0 0
id_user c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
40419 0 8 0 3 4 0 6 0 1 6 0
43284 1 4 0 14 26 2 0 0 2 4 2
45664 0 9 0 15 11 0 0 0 1 6 14
52014 0 0 0 8 9 0 8 0 2 2 1
53488 0 2 0 4 0 0 4 0 0 0 0
Use concat, then groupby and sum:
out = pd.concat([df1,df2]).groupby('id_user',as_index=False).sum()
Out[147]:
id_user c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
0 40419 0 8 0 3 4 0 6 0 1 6 0
1 43284 2 12 0 22 31 2 0 0 4 7 3
2 45664 0 25 0 19 11 0 0 0 6 22 16
3 52014 0 0 0 13 13 0 8 0 2 4 3
4 53488 1 39 0 23 0 0 7 0 3 23 6
5 60135 0 124 0 87 3 0 24 0 8 19 14
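To get the running totals the question asks for (w2 = w0 + w1 + w2 and so on), the same concat/groupby step can be repeated over a growing slice of the weekly dataframes. This is a minimal sketch, assuming the dataframes are kept in a list ordered by week; the function name and the two-column sample frames are made up for illustration.

```python
import pandas as pd


def cumulative_weekly_sums(weekly_frames):
    """Return one dataframe per week holding the running total up to that week."""
    totals = []
    for k in range(len(weekly_frames)):
        # Stack weeks 0..k, then sum the rows that share an id_user
        combined = pd.concat(weekly_frames[: k + 1])
        totals.append(combined.groupby("id_user", as_index=False).sum())
    return totals


# Tiny illustrative frames (first two users and columns of the samples above)
week0 = pd.DataFrame({"id_user": [43284, 45664], "c1": [1, 0], "c2": [8, 16]})
week1 = pd.DataFrame({"id_user": [40419, 43284], "c1": [0, 1], "c2": [8, 4]})
totals = cumulative_weekly_sums([week0, week1])
print(totals[1])
```

A user who is missing from a given week simply contributes nothing for that week, so no fill step is needed. For 16 weeks, re-concatenating each slice is cheap enough; with many more frames, keeping a running sum would avoid the repeated work.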

BCP error if last column is empty

My bcp app is getting an error if the last column in the datafile is empty. This causes an error:
XX,YY,42,0,2,201501,652,,
This doesn't:
XX,YY,42,0,2,201501,652,,0
Unfortunately I can't specify a zero instead of a null. The destination table allows null on every column. The datatype is float (the last three columns are floats in fact). Here's the format file:
8.0
9
1 SQLCHAR 0 10 "," 1 NOT SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 10 "," 2 VIOLATING SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 10 "," 3 COMPANY SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 2 "," 4 POLICY SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 2 "," 5 ON SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 6 "," 6 INFORMATION SQL_Latin1_General_CP1_CI_AS
7 SQLCHAR 0 25 "," 7 SECURITY ""
8 SQLCHAR 0 25 "," 8 QTY2 ""
9 SQLCHAR 0 25 "\n" 9 QTY1 ""
The error:
Row 1, Column 9: Invalid character value for cast specification

Creating user input to manipulate data with a set of criteria

I am having problems with some VBA code in Excel 2010.
I am trying to read data from a spreadsheet in Excel 2010. I have a set of data (see below), and I am trying to write code that uses a message box to ask me for the name I am looking for, e.g. "Name 1", from the list of names in the header row. I then want to set a criterion: if the number in that column equals zero, and the number in a second column ("Name 5") also equals zero, then highlight in red any number in columns "Name 8" and "Name 9" that is greater than, let's say, 30 (just a random example). The important thing is that the red highlight in columns "Name 8" and "Name 9" must only occur if the numbers in that row for "Name 1" and "Name 5" are both equal to zero.
I have already done this, but only by using column references, e.g. A1:A5. Now I need to use the column names, because I want to run the code on different Excel spreadsheets in which the named columns are in different positions. If I use the names, I will always find the right column and can set the criteria no matter where the column sits in the sheet.
The criterion for "Name 1" and "Name 5" will always be = 0 or = 1, but the program has to ask me to choose which one when I search for the name.
In the example below, cells are highlighted red when the criterion of = 0 is met for Name 1 and Name 5 and the number in Name 8/9 is greater than 30. When the number is not greater than 30 but the row still meets the criterion, it is highlighted blue in the Excel spreadsheet example. ALL OTHER NAMES MUST BE IGNORED.
SEE EXAMPLE BELOW
Name 1 Name 2 Name 3 Name 4 Name 5 Name 6 Name 7 Name 8 Name 9 Name 10
0 0 1 0 0 1 58 35 14 19
0 0 0 0 0 1 41 45 68 74
1 0 1 0 1 0 23 18 98 87
0 0 1 0 0 1 65 36 52 89
0 0 0 0 1 1 24 95 47 75
1 1 1 0 1 0 58 87 59 14
0 1 0 0 0 0 74 41 84 32
1 1 0 0 1 0 96 25 74 96
0 0 0 0 0 0 87 35 15 53
0 0 1 0 0 1 57 49 48 47
1 0 1 0 1 1 63 84 23 65
0 1 0 0 0 0 21 54 69 12
0 0 1 0 0 0 54 23 54 54
1 1 0 0 1 1 88 34 77 88
0 0 1 0 0 0 78 48 68 69
1 0 1 0 0 1 96 87 14 65
1 0 0 0 1 0 21 96 54 25
0 1 0 0 0 0 54 72 78 29
0 1 1 0 0 1 62 38 22 78
0 0 0 0 0 0 21 49 65 54
1 0 1 0 1 1 17 65 98 99
0 0 0 0 0 0 59 15 56 70
0 1 1 0 0 0 36 12 29 54
1 0 0 0 1 0 29 49 55 54
Code:
Private Sub CommandButton21_Click()
Cells.Interior.ColorIndex = 0
For Each rw In Range("A1:V22").Rows
If Application.Sum(rw.Resize(, 4)) = 0 Then
cll.Interior.ColorIndex = 3
For Each cll In rw.Offset(, 4).Resize(, 18).Cells
If cll.Value > 50 Then cll.Interior.ColorIndex = 3
Next cll
End If
Next rw
End Sub
If I'm reading right what you want, you could try this. This will ask you to input the name and will then go through your motions on that particular column as a range for the loops. Is that what you are after?
Also, I've changed
If Application.Sum(rw.Resize(, 4)) = 0 Then
cll.Interior.ColorIndex = 3
To rw.Interior.ColorIndex = 3, as I'm guessing this was an error (cll is not assigned until its loop starts on the following line).
Private Sub CommandButton21_Click()
    Dim searchstring As String, coll As Range, colNum As Long, Lrow As Long
    Dim rw As Range, cll As Range
    searchstring = InputBox("Input name?")
    ' Look for the name in the header row
    Set coll = Rows(1).Find(What:=searchstring, LookIn:=xlValues, LookAt:=xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False)
    If coll Is Nothing Then
        MsgBox "Name not found"
        Exit Sub
    End If
    ' Keep the column number in its own variable: "coll = coll.Column" would overwrite the header cell's value
    colNum = coll.Column
    Lrow = Cells(2, colNum).CurrentRegion.Rows.Count
    Cells.Interior.ColorIndex = 0
    For Each rw In Range(Cells(2, colNum), Cells(Lrow, colNum))
        If Application.Sum(rw.Resize(, 4)) = 0 Then
            rw.Interior.ColorIndex = 3
            For Each cll In rw.Offset(, 4).Resize(, 18).Cells
                If cll.Value > 50 Then cll.Interior.ColorIndex = 3
            Next cll
        End If
    Next rw
End Sub

How would you split this given NSString into an NSDictionary?

I have some data I acquire from a Linux box and want to put it into an NSDictionary for later processing.
How would you get this NSString into an NSDictionary like the following?
data
(
bytes
(
60 ( 1370515694 )
48 ( 812 )
49 ( 300 )
...
)
pkt
(
60 ( 380698 )
59 ( 8 )
58 ( 412 )
...
)
block
(
60 ( 5 )
48 ( 4 )
49 ( 7 )
...
)
drop
(
60 ( 706 )
48 ( 2 )
49 ( 4 )
...
)
session
(
60 ( 3 )
48 ( 1 )
49 ( 2 )
...
)
)
The data string looks like:
//time bytes pkt block drop session
60 1370515694 380698 5 706 3
48 812 8 4 2 1
49 300 412 7 4 2
50 0 0 0 0 0
51 87 2 0 0 0
52 87 2 0 0 0
53 0 0 0 0 0
54 0 0 0 0 0
55 0 0 0 0 0
56 0 0 0 0 0
57 812 8 0 0 0
58 812 8 0 0 0
59 0 0 0 0 0
0 0 0 0 0 0
1 2239 12 2 0 0
2 0 0 0 0 0
3 0 0 0 0 0
4 0 0 0 0 0
5 0 0 0 0 0
6 0 0 0 0 0
7 2882 19 2 0 0
8 4906 29 4 0 0
9 1844 15 11 0 0
10 4210 29 17 0 0
11 3370 18 4 0 0
12 3370 18 4 0 0
13 1184 7 3 0 0
14 0 0 0 0 0
15 4046 19 3 0 0
16 4956 23 3 0 0
17 2960 18 2 0 0
18 2960 18 2 0 0
19 1088 6 2 0 0
20 0 0 0 0 0
21 3261 17 3 0 0
22 3261 17 3 0 0
23 1228 6 2 0 0
24 1228 6 2 0 0
25 2628 17 2 0 0
26 4688 26 3 0 0
27 1752 13 5 0 0
28 3062 21 5 0 0
29 174 2 2 0 0
30 96 1 1 0 0
31 4351 23 5 0 0
32 0 0 0 0 0
33 4930 23 7 0 0
34 6750 31 7 0 0
35 1241 6 2 0 0
36 1241 6 2 0 0
37 3571 29 2 0 0
38 0 0 0 0 0
39 1010 5 1 0 0
40 1010 5 1 0 0
41 88859 72 3 0 1
42 90783 81 4 0 1
43 2914 19 3 0 0
44 0 0 0 0 0
45 2157 17 1 0 0
46 2157 17 1 0 0
47 78 1 1 0 0
.
Time (first column) should be the key for the sub-sub-dictionaries.
So the idea behind all that is that I can later randomly access the PKT value at a given TIME x, as well as the BLOCK amount at TIME y, the SESSION value at TIME z, and so on.
Thanks in advance
You probably don't want a dictionary but an array containing dictionaries of all the data entries. The simplest way to parse something like this in Objective-C is to use the componentsSeparatedByString method in NSString
NSString *dataString = <Your Data String>; // Assumes the items are separated by newlines
NSArray *items = [dataString componentsSeparatedByString:@"\n"];
NSMutableArray *dataDictionaries = [NSMutableArray array];
for (NSString *item in items) {
    NSArray *elements = [item componentsSeparatedByString:@" "];
    // Skip blank or malformed lines that do not have all six columns
    if (elements.count < 6) continue;
    NSDictionary *entry = @{
        @"time": [elements objectAtIndex:0],
        @"bytes": [elements objectAtIndex:1],
        @"pkt": [elements objectAtIndex:2],
        @"block": [elements objectAtIndex:3],
        @"drop": [elements objectAtIndex:4],
        @"session": [elements objectAtIndex:5]
    };
    [dataDictionaries addObject:entry];
}
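If the nested layout from the question (metric, then time, then value) is still what you want, the same split-by-line, split-by-space idea builds it directly. The sketch below is in Python rather than Objective-C, purely to show the shape of the result; the function name is hypothetical, and it assumes every data row has exactly six non-negative integer columns.

```python
def parse_stats(data_string):
    """Build {metric: {time: value}} from whitespace-separated rows."""
    columns = ["bytes", "pkt", "block", "drop", "session"]
    result = {name: {} for name in columns}
    for line in data_string.splitlines():
        parts = line.split()
        # Skip blank lines, the "//time ..." header, and anything malformed
        if len(parts) != 6 or not all(p.isdigit() for p in parts):
            continue
        time = int(parts[0])
        for name, value in zip(columns, parts[1:]):
            result[name][time] = int(value)
    return result


sample = (
    "//time bytes pkt block drop session\n"
    "60 1370515694 380698 5 706 3\n"
    "48 812 8 4 2 1\n"
    ".\n"
)
stats = parse_stats(sample)
print(stats["pkt"][60], stats["block"][48])
```

In Objective-C the equivalent would be an NSMutableDictionary per metric, filled inside the same loop as above, with the time string as the key.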

BCP file format for SQL bulk insert of CSV file

I'm trying a bulk insert of a csv file into a SQL table using BCP but can't fix this error: "The column is too long in the data file for row 1, column 2. Verify that the field terminator and row terminator are specified correctly." - Can anyone help please?
Here's my SQL code:
BULK INSERT UKPostCodeStaging
FROM 'C:\Users\user\Desktop\Data\TestFileOf2Records.csv'
WITH (
DATAFILETYPE='char',
FIRSTROW = 1,
FORMATFILE = 'C:\Users\User\UKPostCodeStaging.fmt');
Here's my test data contained in TestFileOf2Records.csv:
"HS1 2AA",10,14,93,"S923","","S814","","S1213","S132605"
"HS1 2AD",10,14,93,"S923","","S814","","S1213","S132605"
And here's my BCP file that I have attempted to edit appropriately:
10.0
11
1 SQLCHAR 0 0 "\"" 0 FIRST_QUOTE SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 8 0 "\"," 1 PostCode SQL_Latin1_General_CP1_CI_AS
3 SQLINT 1 0 "," 2 PositionalQualityIndicator ""
4 SQLINT 1 0 "," 3 MetresEastOfOrigin ""
5 SQLINT 1 0 ",\"" 4 MetresNorthOfOrigin ""
6 SQLCHAR 8 0 "\",\"" 5 CountryCode SQL_Latin1_General_CP1_CI_AS
7 SQLCHAR 8 0 "\",\"" 6 NHSRegionalHACode SQL_Latin1_General_CP1_CI_AS
8 SQLCHAR 8 0 "\",\"" 7 NHSHACode SQL_Latin1_General_CP1_CI_AS
9 SQLCHAR 8 0 "\",\"" 8 AdminCountyCode SQL_Latin1_General_CP1_CI_AS
10 SQLCHAR 8 0 "\",\"" 9 AdminDistrictCode SQL_Latin1_General_CP1_CI_AS
11 SQLCHAR 8 0 "\"\r\n" 10 AdminWardCode SQL_Latin1_General_CP1_CI_AS
Any ideas where I am going wrong?
thanks
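One way to narrow this down is to check whether the terminators on their own would cut a sample row into the 11 expected fields. The Python sketch below mimics that by consuming the row left to right and cutting at the first occurrence of each terminator in turn. The helper name is hypothetical, and it assumes the file really ends each row with \r\n, which the question does not confirm.

```python
def split_by_terminators(record, terminators):
    """Cut one field at each terminator in turn, scanning left to right."""
    fields = []
    position = 0
    for terminator in terminators:
        end = record.find(terminator, position)
        if end == -1:
            raise ValueError(
                f"terminator {terminator!r} not found after offset {position}"
            )
        fields.append(record[position:end])
        position = end + len(terminator)
    return fields


# Terminators copied from the format file above, in field order
terminators = ['"', '",', ',', ',', ',"', '","', '","', '","', '","', '","', '"\r\n']
record = '"HS1 2AA",10,14,93,"S923","","S814","","S1213","S132605"\r\n'
fields = split_by_terminators(record, terminators)
print(fields)
```

If the row splits cleanly, the terminators are probably not the culprit, and the other columns of the format file may be worth a look: for a character data file, the prefix length (8 and 1 above) is normally 0 and every field's data type is normally SQLCHAR, including the numeric ones.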