if statement in excel, adding 1 if cell with text but - vba

I am creating an Excel sheet that has three columns: Detail, Month, and Month Count.
1 -- I would like the formula to look at the Detail column and, if it contains text, put the previous count plus 1 in the Month Count column; if not, insert 0.
2 -- I would like the formula to add 1 to the last count before a cell with 0, so that a cell with 0 neither affects the other cells nor resets the count back to 1, which is the problem I am having.
3 -- I also need the count to reset for every month, from whatever number it was back to 0 or 1, depending on whether the first cell of the new month has text. For this the formula needs to look at the Month column.
This is what I have so far:
=IF(ISTEXT(G95), I94+ 1, 0)

The formula for the count column should be as follows.
=IF(A2<>"",COUNTIF($B$1:B2,B2)-COUNTIFS($A$1:A2,"",$B$1:B2,B2),0)
Breakdown of how this works:
A2<>"" checks whether the detail column is populated.
COUNTIF($B$1:B2,B2) counts how many rows, from the top of the sheet down to and including this row, reference the same month.
COUNTIFS($A$1:A2,"",$B$1:B2,B2) counts how many of those rows have a blank detail cell while also matching the month. Subtracting this from the previous count gives the number of non-blank rows for the month so far.
The IF returns 0 if the detail is empty.
Which returned the following data ("-" marks an empty Detail cell):
       Orderly               Random
Det  Mon  Count      Det  Mon  Count
X    1    1          -    2    0
X    1    2          X    1    1
X    1    3          X    1    2
-    1    0          -    2    0
X    1    4          X    2    1
X    2    1          X    1    3
X    2    2          X    1    4
-    2    0          -    1    0
-    2    0          -    1    0
-    2    0          -    2    0
-    3    0          -    3    0
X    3    1          X    3    1
-    3    0          -    1    0
X    3    2          -    3    0
X    3    3          X    1    5
-    3    0          X    2    2
X    3    4          X    3    2
-    3    0          -    3    0
X    3    5          -    3    0
X    3    6          -    2    0
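For readers who want to check the counting rule outside Excel, here is a minimal Python sketch of the same logic: a per-month running count of non-empty detail cells, with 0 written for empty cells. The function name `month_counts` and the sample lists are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict


def month_counts(details, months):
    """Running count of non-empty details within each month; 0 for empty rows."""
    seen = defaultdict(int)  # month -> non-empty details counted so far
    counts = []
    for detail, month in zip(details, months):
        if detail:
            seen[month] += 1
            counts.append(seen[month])
        else:
            # An empty detail gets 0 and does not disturb the month's count.
            counts.append(0)
    return counts


# First eight rows of the "Orderly" data above ("" stands for an empty cell).
details = ["X", "X", "X", "", "X", "X", "X", ""]
months = [1, 1, 1, 1, 1, 2, 2, 2]
print(month_counts(details, months))  # [1, 2, 3, 0, 4, 1, 2, 0]
```

Because the count is keyed by month rather than by position, it also handles the "Random" ordering, where months are interleaved.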

It sounds like you want to keep a running total for the month count in the column and put a 0 if there is no text. If that is the case, you can put this formula in I95.
=IF(ISTEXT(G95),MAX($I$2:I94)+1, 0)


How to count unique combinations of Co-ordinates to find most customers in grid section

I have a customer table with their closest delivery hub on a grid based system and need to calculate what is the most populated area using a query.
This is the current query I have that lists all of the Co-ordinates per Customer.
SELECT Customers.HubID,
       TO_CHAR(Hubs.HubCoordX, 'FM999999999999') AS "X Co-ordinate",
       TO_CHAR(Hubs.HubCoordY, 'FM999999999999') AS "Y Co-ordinate"
FROM Customers
INNER JOIN Hubs ON Customers.HubID = Hubs.DestinationID
ORDER BY Hubs.HubCoordX, Hubs.HubCoordY
This query creates the following result.
HubID  X Co-ord  Y Co-ord
9      -3        1
11     -2        18
2      0         0
3      0         0
3      0         0
1      0         0
1      0         0
3      0         0
4      3         1
5      3         1
7      7         3
But I need a result like this
X Co-ordinate  Y Co-ordinate  Population
-3             1              1
-2             18             1
0              0              6
3              1              2
7              3              1
Thanks in advance
I have attempted to use a unique count; however, it only counted each individual co-ordinate once.
SELECT TO_CHAR(Hubs.HubCoordX, 'FM999999999999') AS "X Co-ordinate",
TO_CHAR(Hubs.HubCoordY, 'FM999999999999') AS "Y Co-ordinate", COUNT(HubID) AS "Population"
FROM Customers
INNER JOIN Hubs ON Customers.HubID = Hubs.DestinationID
Group BY Hubs.HubCoordX, Hubs.HubCoordY
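To make the expected grouping concrete, here is a small Python sketch that counts customers per (X, Y) pair using the rows from the first result set in the question. It only illustrates what a GROUP BY on both co-ordinates should return; the function name `population_by_cell` is a hypothetical helper, not part of any query shown above.

```python
from collections import Counter

# (HubID, X, Y) rows copied from the question's first result set.
rows = [
    (9, -3, 1), (11, -2, 18), (2, 0, 0), (3, 0, 0), (3, 0, 0), (1, 0, 0),
    (1, 0, 0), (3, 0, 0), (4, 3, 1), (5, 3, 1), (7, 7, 3),
]


def population_by_cell(rows):
    """Count one customer per row, keyed by the (X, Y) grid cell."""
    return Counter((x, y) for _hub_id, x, y in rows)


population = population_by_cell(rows)
for (x, y), n in sorted(population.items()):
    print(x, y, n)
```

The key point is that the count is taken per pair of co-ordinates, not per individual co-ordinate value or per HubID.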

rolling sum of a column in pandas dataframe at variable intervals

I have a list of index numbers that represent index locations for a DF. list_index = [2,7,12]
I want to sum from a single column in the DF by rolling through each number in list_index and totaling the counts between the index points (and restart count at 0 at each index point). Here is a mini example.
The desired output is in OUTPUT column, which increments every time there is another 1 from COL 1 and RESTARTS the count at 0 on the location after the number in the list_index.
I was able to get it to work with a loop but there are millions of rows in the DF and it takes a while for the loop to run. It seems like I need a lambda function with a sum but I need to input start and end point in index.
Something like lambda x:x.rolling(start_index, end_index).sum()? Can anyone help me out on this.
You can try a cumulative sum that keeps only the information related to the 1 values; a rolling sum with different intervals is not possible.
a = df['col'].eq(1).cumsum()
df['output'] = a - a.mask(df['col'].eq(1)).ffill().fillna(0).astype(int)
Out:
col output
0 0 0
1 1 1
2 1 2
3 0 0
4 1 1
5 1 2
6 1 3
7 0 0
8 0 0
9 0 0
10 0 0
11 1 1
12 1 2
13 0 0
14 0 0
15 1 1
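Note that the answer above restarts the count at every 0 in the column rather than at the positions in list_index. If the restart points must come from list_index, one possible vectorised sketch is to label each row with a segment number and take a cumulative sum per segment. This assumes list_index is sorted, that it refers to row positions, and that the count restarts on the row after each listed position; the function name `segmented_cumsum` is illustrative.

```python
import numpy as np
import pandas as pd


def segmented_cumsum(col, list_index):
    """Cumulative sum of `col` that restarts on the row after each position in list_index."""
    positions = np.arange(len(col))
    # Segment id = number of restart positions strictly before this row.
    segment = np.searchsorted(np.asarray(list_index), positions, side="left")
    return col.groupby(segment).cumsum()


df = pd.DataFrame({"col": [0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1]})
df["output"] = segmented_cumsum(df["col"], [2, 7, 12])
print(df)
```

This avoids the Python-level loop, since both `searchsorted` and the grouped `cumsum` operate on whole arrays.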

how to calculate the specific accumulated amount in t-sql

For each row, I need to calculate the integer part of val divided by 4. For each subsequent row, the remainder carried over from the previous rows is added to the current row's remainder, and that sum is again split into a whole part and a remainder on division by 4. Consider the example below:
id val
1 22
2 1
3 1
4 2
5 1
6 6
7 1
After dividing by 4, we look at the whole part and the remainder. For each id, we add the current remainder to the remainder accumulated so far, and every time the accumulated amount reaches 4 it contributes another whole unit:
id  val  wh1  rem1  wh2      rem2   RESULT (wh1+wh2)
1   22   5    2     0        2      5
2   1    0    1     3/4=0    3%4=3  0
3   1    0    1     4/4=1    4%4=0  1
4   2    0    2     2/4=0    2%4=2  0
5   1    0    1     3/4=0    3%4=3  0
6   6    1    2     5/4=1    5%4=1  2
7   1    0    1     2/4=0    2%4=2  0
How can I get the RESULT column with SQL?
Data of project:
http://sqlfiddle.com/#!18/9e18f/2
Getting the whole part of the division by 4 is easy; the problem is to calculate the accumulated remainder for each id and to work out when that accumulated remainder itself becomes divisible by 4.
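The carry rule described in the question can be sketched in Python as follows. This is only an illustration of the arithmetic, not a T-SQL solution; the function names are assumptions. The second function shows an equivalent formulation based on running totals, which suggests that in SQL a windowed running sum such as SUM(val) OVER (ORDER BY id) might be a workable starting point.

```python
def carry_results(values, divisor=4):
    """Whole part per row, plus whole units produced by the carried remainder."""
    results = []
    carried = 0  # remainder accumulated from earlier rows (rem2 in the question)
    for val in values:
        whole, rem = divmod(val, divisor)                # wh1, rem1
        extra, carried = divmod(carried + rem, divisor)  # wh2, rem2
        results.append(whole + extra)
    return results


def carry_results_running_total(values, divisor=4):
    """Same result via running totals: floor(total / d) minus floor(previous total / d)."""
    results = []
    total = 0
    for val in values:
        previous = total
        total += val
        results.append(total // divisor - previous // divisor)
    return results


vals = [22, 1, 1, 2, 1, 6, 1]
print(carry_results(vals))                # [5, 0, 1, 0, 0, 2, 0]
print(carry_results_running_total(vals))  # [5, 0, 1, 0, 0, 2, 0]
```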

Assigning one column to another column between pandas DataFrames (like vector to vector assignment)

I have a super strange problem which I spent the last hour trying to solve, but with no success. It is even more strange since I can't replicate it on a small scale.
I have a large DataFrame (150,000 entries). I took out a subset of it and did some manipulation. The subset was saved as a different variable, x.
x is smaller than the df, but its index is in the same range as the df. I'm now trying to assign x back to the DataFrame replacing values in the same column:
rep_Callers['true_vpID'] = x.true_vpID
This inserts all the different values in x to the right place in df, but instead of keeping the df.true_vpID values that are not in x, it is filling them with NaNs. So I tried a different approach:
df.ix[x.index,'true_vpID'] = x.true_vpID
But instead of filling x values in the right place in df, the df.true_vpID gets filled with the first value of x and only it! I changed the first value of x several times to make sure this is indeed what is happening, and it is. I tried to replicate it on a small scale but it didn't work:
df = DataFrame({'a':ones(5),'b':range(5)})
a b
0 1 0
1 1 1
2 1 2
3 1 3
4 1 4
z =Series([random() for i in range(5)],index = range(5))
0 0.812561
1 0.862109
2 0.031268
3 0.575634
4 0.760752
df.ix[z.index[[1,3]],'b'] = z[[1,3]]
a b
0 1 0.000000
1 1 0.812561
2 1 2.000000
3 1 0.575634
4 1 4.000000
5 1 5.000000
I really tried it all, need some new suggestions...
Try using df.update(updated_df_or_series)
Also using a simple example, you can modify a DataFrame by doing an index query and modifying the resulting object.
df_1
a b
0 1 0
1 1 1
2 1 2
3 1 3
4 1 4
df_2 = df_1.ix[3:5]
df_2.b = df_2.b + 2
df_2
a b
3 1 5
4 1 6
df_1
a b
0 1 0
1 1 1
2 1 2
3 1 5
4 1 6
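A short sketch of the df.update suggestion, with a label-based alternative, is given here. The .ix indexer used in the snippets above has been removed from recent pandas releases, so this sketch uses .loc instead. Whether changes made to a slice write through to the original frame depends on the pandas version, so it is probably safer not to rely on that behaviour. The sample values are made up for illustration.

```python
import pandas as pd

# Small stand-ins for the large frame and the modified subset (values are made up).
df = pd.DataFrame({"true_vpID": [10.0, 11.0, 12.0, 13.0, 14.0]})
x = pd.Series([99.0, 77.0], index=[1, 3], name="true_vpID")

# Option 1: update() aligns on the index and overwrites only where x has values.
# The Series needs a name matching the target column.
updated = df.copy()
updated.update(x)

# Option 2: label-based assignment to exactly the rows present in x.
assigned = df.copy()
assigned.loc[x.index, "true_vpID"] = x

print(updated["true_vpID"].tolist())   # [10.0, 99.0, 12.0, 77.0, 14.0]
print(assigned["true_vpID"].tolist())  # [10.0, 99.0, 12.0, 77.0, 14.0]
```

In both cases the rows that are not in x keep their original values instead of turning into NaN.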

How to calculate the number of pairs in an Excel spreadsheet?

I have two columns of integers between 1 and 16 in an Excel file. I'd like to count the number of pairs of integers in these columns. There are 256 cases and I'd like to have a column which tells me how many pairs exist for each case. For instance, I have a table like below:
1 2
1 1
1 3
1 4
1 1
1 8
1 1
16 16
1 2
...
And I'd like to calculate a column like this:
3 (number of 1 1s)
2 (number of 1 2s)
1 (number of 1 3s)
1 (number of 1 4s)
0 (number of 1 5s)
0 (number of 1 6s)
0 (number of 1 7s)
1 (number of 1 8s)
...
1 (number of 16 16s)
I'd appreciate if someone can help me with the calculation.
First you need to create two columns with all possible combinations:
1 1
1 2
1 3
...
2 1
2 2
...
16 16
Let's assume these are in columns C,D and your data are in columns A, B, in rows 1 to 1000. Then you can use an array formula:
=SUM(IF(($A$1:$A$1000=C1)*($B$1:$B$1000=D1);1;0))
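You must press Shift+Ctrl+Enter when entering an array formula. Depending on your regional settings, the argument separator may be a comma instead of a semicolon.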
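As a cross-check outside Excel, the same tally can be sketched in Python: count every (first, second) pair that occurs, then report a count for all 256 combinations, including those that never occur. The function name `pair_counts` and the sample columns (taken from the question's example rows) are illustrative.

```python
from collections import Counter


def pair_counts(col_a, col_b, low=1, high=16):
    """Count occurrences of every (a, b) combination in low..high, zeros included."""
    seen = Counter(zip(col_a, col_b))
    return {
        (a, b): seen[(a, b)]
        for a in range(low, high + 1)
        for b in range(low, high + 1)
    }


# The nine example rows from the question.
col_a = [1, 1, 1, 1, 1, 1, 1, 16, 1]
col_b = [2, 1, 3, 4, 1, 8, 1, 16, 2]

counts = pair_counts(col_a, col_b)
print(counts[(1, 1)], counts[(1, 2)], counts[(1, 5)], counts[(16, 16)])  # 3 2 0 1
```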