.replace only replacing first value - pandas

I have the following code:
df_demo['Age'] = df_demo['Age'].replace([23842674135270370,
23842674044440370, 23842674044420370, 23842674044430370],
['18-24', '25-34', '35-44', '45+'])
(The numbers are ad ID tags, and I'm trying to replace them with the age groups they are targeting.)
The code only matches the first number and replaces it (with 18-24). The remaining numbers are not matched or replaced. If I change the order of the numbers (for example, moving the 25-34 pairing to the front), it replaces that first pairing but none of the others.
I have exactly the same construction for .replace() -- using two lists within the () -- further up in my program and it's working perfectly. But this one is not, and I can't figure out why it is not working.

What worked for me was reading the Age column as strings (via dtype) and then replacing those strings with the new values:
df_demo = pd.read_csv('demographics - Sheet1.csv', dtype={'Age':str})
print(df_demo.tail())
Name Age Newsletter
190 191 23842674135270370 Yes
191 192 23842674135270370 Yes
192 193 23842674044420370 Yes
193 194 23842674135270370 Yes
194 195 23842674044420370 Yes
df_demo['Age'] = df_demo['Age'].replace(
['23842674135270370','23842674044440370','23842674044420370','23842674044430370'],
['18-24', '25-34', '35-44', '45+'])
print(df_demo.tail())
Name Age Newsletter
190 191 18-24 Yes
191 192 18-24 Yes
192 193 35-44 Yes
193 194 18-24 Yes
194 195 35-44 Yes
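A minimal, self-contained sketch of the same approach, assuming a small in-memory CSV in place of 'demographics - Sheet1.csv' (the sample rows and the name id_to_age are made up for illustration). Reading Age as str keeps the long IDs as exact text, and a dict keeps each ID next to its label:

```python
import io

import pandas as pd

# Hypothetical stand-in for 'demographics - Sheet1.csv'
csv_text = """Name,Age,Newsletter
191,23842674135270370,Yes
192,23842674044440370,Yes
193,23842674044420370,No
194,23842674044430370,Yes
"""

# dtype={'Age': str} keeps the ad IDs as text instead of parsing them as numbers
df_demo = pd.read_csv(io.StringIO(csv_text), dtype={'Age': str})

# Map each ad ID (as a string) to the age group it targets
id_to_age = {
    '23842674135270370': '18-24',
    '23842674044440370': '25-34',
    '23842674044420370': '35-44',
    '23842674044430370': '45+',
}

df_demo['Age'] = df_demo['Age'].replace(id_to_age)
print(df_demo)
```

The dict form is equivalent to passing two parallel lists, but it is harder to misalign an ID with its label.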

Generate random numbers from a particular column

I have a dataset in this form
item_id EAN Price
3434 232 34
3233 412 28
There are 54344 data points in total.
I want to print 40 random values from EAN. I tried a few things, such as
df=pd.read_csv('item_desc.csv')
print(df['EAN'].random.rand(40))
but it didn't work. Can someone suggest the right code?
You can use sample:
df.sample(n=40)
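Note that df.sample(n=40) returns 40 whole rows. If only the EAN values are wanted, sample can be called on the column instead. A small sketch, assuming a made-up DataFrame in place of item_desc.csv (random_state is optional and only makes the draw repeatable):

```python
import pandas as pd

# Hypothetical stand-in for the data read from item_desc.csv
df = pd.DataFrame({
    "item_id": range(1000, 1100),
    "EAN": range(200, 300),
    "Price": [10] * 100,
})

# Draw 40 values from the EAN column, without replacement (the default)
sampled = df["EAN"].sample(n=40, random_state=0)
print(sampled.tolist())
```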

translate Dataframe using crosswalk in julia

I have a very large dataframe (original_df) with columns of codes
14 15
21 22
18 16
And a second dataframe (crosswalk) which maps 'old_codes' to 'new_codes'
14 104
15 105
16 106
18 108
21 201
22 202
Of course, the resultant df (resultant_df) that I would like would have values:
104 105
201 202
108 106
I am aware of two ways to accomplish this. First, I could iterate through each code in original_df, find the code in crosswalk, then rewrite the corresponding cell in original_df with the translated code from crosswalk. The faster and more natural option would be to leftjoin() each column of original_df on 'old_codes'. Unfortunately, it seems I have to do this separately for each column, and then delete each column after its conversion column has been created -- this feels unnecessarily complicated. Is there a simpler way to convert all of original_df at once using the crosswalk?
You can do the following (I am using column numbers as you have not provided column names):
d = Dict(crosswalk[!, 1] .=> crosswalk[!, 2])
resultant_df = select(original_df, [i => ByRow(x -> d[x]) for i in 1:ncol(original_df)], renamecols=false)

PDF How to get Font object with id not in cross reference table

Like in this discussion,
Tj command with angle brackets
I'm faced with a Tj operator whose operand is between angle brackets:
<00030037005200570044004F000300550048004600520051005100580056>Tj
The parent page gives the list of font object IDs like this:
Font /C2_0 39 0 R/T1_0 41 0 R/T1_1 43 0 R/T1_2 44 0 R
and for the object where the angle-bracket string appears, a Tf operator specifies that the font reference is C2_0.
So from the font list, I know the C2_0 font object is 39.
OK, but what is the fastest way to access object 39, which is embedded in a stream object with ID 16? Object 16 contains the list of embedded objects:
32 0 33 106 34 131 35 141 36 193 37 436 38 16720 39 16728 ....
So my question is: how do I get the value 16 when I only know that font object 39 is not in the cross-reference table? Do I have to parse every stream object and read its list of embedded objects to find the one that contains object 39?
Thanks for your attention.
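One observation that may help (based on the PDF 1.5+ object stream format, not on anything specific to this file): each pair in that list is an object number followed by a byte offset, with offsets measured from the stream dictionary's /First value. If the file has a cross-reference stream (/Type /XRef), compressed objects normally appear there as type 2 entries that name the containing object stream directly, so scanning every stream should only be a fallback. A minimal Python sketch of reading the pair list, using only the pairs shown above (the function name is hypothetical, and the stream is assumed to be already decoded):

```python
def parse_objstm_header(header: str) -> dict[int, int]:
    """Turn 'objnum offset objnum offset ...' into {objnum: offset}."""
    nums = [int(tok) for tok in header.split()]
    if len(nums) % 2 != 0:
        raise ValueError("header must contain object-number/offset pairs")
    # Even positions are object numbers, odd positions are their offsets
    return dict(zip(nums[0::2], nums[1::2]))


# Only the pairs quoted in the question; the real list continues
header = "32 0 33 106 34 131 35 141 36 193 37 436 38 16720 39 16728"
offsets = parse_objstm_header(header)

print(39 in offsets)   # is object 39 stored in this object stream?
print(offsets[39])     # its offset, relative to /First
```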

Get length from item set

I have a set of data in this format:
# Input Data for items in the format (i,l,w,h)
# i for item, l for length, w for width, h for height
set itemData :=
271440 290 214 361
1504858 394 194 114
4003733 400 200 287
4012512 396 277 250
4013886 273 221 166;
I am trying to get the length of each item using the following code:
set IL = setof {i in items, (i,l,w,h) in itemData} (i,l); #length of item i
However, this method does not let me access an individual item's length.
What I am trying to do is have
display IL[271440] = 290;
How can I go about doing this?
Careful with terminology. In AMPL terms, that table isn't a "set". You have a table of parameters. Your sets are the row and column indices for that table: {"l","w","h"} for the columns, and item ID numbers for the rows.
In AMPL it would be handled something like this:
(.mod part)
set items;
set attributes := {"l","w","h"};
param itemData{items, attributes};
(.dat part)
set items :=
271440
1504858
4003733
4012512
4013886
;
param itemData: l w h :=
271440 290 214 361
1504858 394 194 114
4003733 400 200 287
4012512 396 277 250
4013886 273 221 166
;
You can then do:
ampl: display itemData[271440,"l"];
itemData[271440,'l'] = 290
I think it's possible to define set "items" at the same time as itemData and avoid the need to duplicate the ID numbers. Section 9.2 of the AMPL Book shows how to do this for a parameter that has a single index set, but I'm not sure of the syntax for doing this when you have two index sets as above. (If anybody does know, please add it!)

Multiply by a cell formula

I am calculating the following Excel table in VBA and leaving the results as values because of the volume of data. But then I have to multiply this range by 1 or 0 depending on a column.
The problem is that I don't want to multiply by 0, because I would lose my data and have to recalculate it.
So, after my macro runs I get a table like this, for example:
var.1 var.2 var.3
0 0 0
167 92 549
159 87 621
143 95 594
124 61 463
0 0 0
5 12 75
in Range("A2:C9").
Range("A1:C1") will hold 1 or 0 values that change, so I need Range("A2:C9") to look like:
var.1 var.2 var.3
=0*A$1 =0*B$1 =0*C$1
=167*A$1 =92*B$1 =549*C$1
...
Is it possible to do this with a macro? Thanks.
Okay, what I would do here is first copy the original data to another sheet or set of columns so that it is always preserved. Then use this formula:
=IF($A$1=0, 0, E3)
In place of E3, reference the cell that holds the copied data.