以下是我的数据帧示例:
+--------------+-------+-------------+--------------+----------+-----------+
| ID | Part | RequestFrom | QTYRequested | Location | QTYOnHand |
+--------------+-------+-------------+--------------+----------+-----------+
| PartACity 1 | PartA | City 1 | 1 | LocA | 2 |
| PartACity 2 | PartA | City 2 | 1 | LocA | 2 |
| PartACity 3 | PartA | City 3 | 1 | LocA | 2 |
| PartACity 4 | PartA | City 4 | 1 | LocA | 2 |
| PartACity 5 | PartA | City 5 | 1 | LocA | 2 |
| PartACity 6 | PartA | City 6 | 1 | LocA | 2 |
| PartACity 7 | PartA | City 7 | 1 | LocA | 2 |
| PartACity 8 | PartA | City 8 | 1 | LocA | 2 |
| PartACity 9 | PartA | City 9 | 1 | LocA | 2 |
| PartACity 10 | PartA | City 10 | 1 | LocA | 2 |
| PartACity 1 | PartA | City 1 | 1 | LocB | 3 |
| PartACity 2 | PartA | City 2 | 1 | LocB | 3 |
| PartACity 3 | PartA | City 3 | 1 | LocB | 3 |
| PartACity 4 | PartA | City 4 | 1 | LocB | 3 |
| PartACity 5 | PartA | City 5 | 1 | LocB | 3 |
| PartACity 6 | PartA | City 6 | 1 | LocB | 3 |
| PartACity 7 | PartA | City 7 | 1 | LocB | 3 |
| PartACity 8 | PartA | City 8 | 1 | LocB | 3 |
| PartACity 9 | PartA | City 9 | 1 | LocB | 3 |
| PartACity 10 | PartA | City 10 | 1 | LocB | 3 |
| PartACity 1 | PartA | City 1 | 1 | LocC | 4 |
| PartACity 2 | PartA | City 2 | 1 | LocC | 4 |
| PartACity 3 | PartA | City 3 | 1 | LocC | 4 |
| PartACity 4 | PartA | City 4 | 1 | LocC | 4 |
| PartACity 5 | PartA | City 5 | 1 | LocC | 4 |
| PartACity 6 | PartA | City 6 | 1 | LocC | 4 |
| PartACity 7 | PartA | City 7 | 1 | LocC | 4 |
| PartACity 8 | PartA | City 8 | 1 | LocC | 4 |
| PartACity 9 | PartA | City 9 | 1 | LocC | 4 |
| PartACity 10 | PartA | City 10 | 1 | LocC | 4 |
+--------------+-------+-------------+--------------+----------+-----------+
我想将上面的数据框转换为:
+-------------+-------+-------------+--------------+----------+-----------+
| ID | Part | RequestFrom | QTYRequested | Location | QTYOnHand |
+-------------+-------+-------------+--------------+----------+-----------+
| PartACity 1 | PartA | City 1 | 1 | LocA | 2 |
| PartACity 2 | PartA | City 2 | 1 | LocA | 2 |
| PartACity 3 | PartA | City 3 | 1 | LocB | 3 |
| PartACity 4 | PartA | City 4 | 1 | LocB | 3 |
| PartACity 5 | PartA | City 5 | 1 | LocB | 3 |
| PartACity 6 | PartA | City 6 | 1 | LocC | 4 |
| PartACity 7 | PartA | City 7 | 1 | LocC | 4 |
| PartACity 8 | PartA | City 8 | 1 | LocC | 4 |
| PartACity 9 | PartA | City 9 | 1 | LocC | 4 |
+-------------+-------+-------------+--------------+----------+-----------+
如您所见,总QTYOnHand为9,但我们对A部分有10个开放请求.
我想找到一种更好的方式来分配数量.
由于LocA只有两个数量的PartA,所以我们只保留前两行.
LocB有3个PartA数量,接下来的3个数量将分配给LocB.
LocC有4个PartA数量,接下来的4个数量将分配给LocC.
任何帮助将不胜感激!!!
最佳答案
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import pandas as pd
>>> df = pd.DataFrame({
'ID' : ['PartACity 1', 'PartACity 2', 'PartACity 3', 'PartACity 4', 'PartACity 5', 'PartACity 6', 'PartACity 7', 'PartACity 8', 'PartACity 9', 'PartACity 10', 'PartACity 1', 'PartACity 2', 'PartACity 3', 'PartACity 4', 'PartACity 5', 'PartACity 6', 'PartACity 7', 'PartACity 8', 'PartACity 9', 'PartACity 10', 'PartACity 1', 'PartACity 2', 'PartACity 3', 'PartACity 4', 'PartACity 5', 'PartACity 6', 'PartACity 7', 'PartACity 8', 'PartACity 9', 'PartACity 10'],
'Part' : ['PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA', 'PartA'],
'RequestFrom': ['City 1', 'City 2', 'City 3', 'City 4', 'City 5', 'City 6', 'City 7', 'City 8', 'City 9', 'City 10', 'City 1', 'City 2', 'City 3', 'City 4', 'City 5', 'City 6', 'City 7', 'City 8', 'City 9', 'City 10', 'City 1', 'City 2', 'City 3', 'City 4', 'City 5', 'City 6', 'City 7', 'City 8', 'City 9', 'City 10'],
'QTYRequested': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
'Location': ['LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocA', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocB', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC', 'LocC'],
'QTYOnHand': [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
})
>>> print(df)
ID Location ... QTYRequested RequestFrom
0 PartACity 1 LocA ... 1 City 1
1 PartACity 2 LocA ... 1 City 2
2 PartACity 3 LocA ... 1 City 3
3 PartACity 4 LocA ... 1 City 4
4 PartACity 5 LocA ... 1 City 5
5 PartACity 6 LocA ... 1 City 6
6 PartACity 7 LocA ... 1 City 7
7 PartACity 8 LocA ... 1 City 8
8 PartACity 9 LocA ... 1 City 9
9 PartACity 10 LocA ... 1 City 10
10 PartACity 1 LocB ... 1 City 1
11 PartACity 2 LocB ... 1 City 2
12 PartACity 3 LocB ... 1 City 3
13 PartACity 4 LocB ... 1 City 4
14 PartACity 5 LocB ... 1 City 5
15 PartACity 6 LocB ... 1 City 6
16 PartACity 7 LocB ... 1 City 7
17 PartACity 8 LocB ... 1 City 8
18 PartACity 9 LocB ... 1 City 9
19 PartACity 10 LocB ... 1 City 10
20 PartACity 1 LocC ... 1 City 1
21 PartACity 2 LocC ... 1 City 2
22 PartACity 3 LocC ... 1 City 3
23 PartACity 4 LocC ... 1 City 4
24 PartACity 5 LocC ... 1 City 5
25 PartACity 6 LocC ... 1 City 6
26 PartACity 7 LocC ... 1 City 7
27 PartACity 8 LocC ... 1 City 8
28 PartACity 9 LocC ... 1 City 9
29 PartACity 10 LocC ... 1 City 10
[30 rows x 6 columns]
Duplicate
df
astemp_df
to aggregate the quantity on hand and keep track of the quantity left for each location by creating a new columnQTYLeft
:
>>> temp_df = df
>>> temp_df = temp_df.groupby('Location').agg({'QTYOnHand':'first'})
>>> temp_df = temp_df.reset_index()
>>> temp_df['QTYLeft'] =temp_df['QTYOnHand']
>>> print(temp_df)
Location QTYOnHand QTYLeft
0 LocA 2 2
1 LocB 3 3
2 LocC 4 4
Group
df
byID
,Part
,RequestFrom
:
>>> df = df.groupby(['ID', 'Part', 'RequestFrom']).first()
>>> df = df.reset_index()
>>> print(df)
ID Part ... QTYOnHand QTYRequested
0 PartACity 1 PartA ... 2 1
1 PartACity 10 PartA ... 2 1
2 PartACity 2 PartA ... 2 1
3 PartACity 3 PartA ... 2 1
4 PartACity 4 PartA ... 2 1
5 PartACity 5 PartA ... 2 1
6 PartACity 6 PartA ... 2 1
7 PartACity 7 PartA ... 2 1
8 PartACity 8 PartA ... 2 1
9 PartACity 9 PartA ... 2 1
[10 rows x 6 columns]
Values in
ID
column are strings and thus cannot be used as an index to sort according to ascending numbers, thus we create a new temporary index calledtemp_index
first, sort thedf
in ascending order, then remove said index:
>>> df = df.assign(temp_index=[int(float(i.split(' ')[-1])) for i in df['ID']])
>>> df = df.sort_values(by='temp_index')
>>> print(df)
ID Part ... QTYRequested temp_index
0 PartACity 1 PartA ... 1 1
2 PartACity 2 PartA ... 1 2
3 PartACity 3 PartA ... 1 3
4 PartACity 4 PartA ... 1 4
5 PartACity 5 PartA ... 1 5
6 PartACity 6 PartA ... 1 6
7 PartACity 7 PartA ... 1 7
8 PartACity 8 PartA ... 1 8
9 PartACity 9 PartA ... 1 9
1 PartACity 10 PartA ... 1 10
[10 rows x 7 columns]
>>> del df['temp_index']
Create a new user-defined function (UDF) and apply it to allocate the available quantity per location, with the smaller indexes being allocated first as per your question:
>>> def allocate_qty(row):
global temp_df
try:
temp_df = temp_df[(temp_df['QTYLeft'] != 0)]
avail_qty = temp_df['QTYOnHand'].values[0]
avail_location = temp_df['Location'].values[0]
temp_df['QTYLeft'].values[0] = temp_df['QTYLeft'].values[0] - row['QTYRequested']
return avail_location, avail_qty
except:
return 'Not Allocated', 0
>>> df['Location'], df['QTYOnHand'] = zip(*df.apply(allocate_qty, axis=1))
>>> print(df)
ID Part ... QTYOnHand QTYRequested
0 PartACity 1 PartA ... 2 1
2 PartACity 2 PartA ... 2 1
3 PartACity 3 PartA ... 3 1
4 PartACity 4 PartA ... 3 1
5 PartACity 5 PartA ... 3 1
6 PartACity 6 PartA ... 4 1
7 PartACity 7 PartA ... 4 1
8 PartACity 8 PartA ... 4 1
9 PartACity 9 PartA ... 4 1
1 PartACity 10 PartA ... 0 1
[10 rows x 6 columns]
Filter out rows which did not manage to be allocated the resources:
>>> df = df[(df['Location'] != 'Not Allocated')]
>>> print(df)
ID Part ... QTYOnHand QTYRequested
0 PartACity 1 PartA ... 2 1
2 PartACity 2 PartA ... 2 1
3 PartACity 3 PartA ... 3 1
4 PartACity 4 PartA ... 3 1
5 PartACity 5 PartA ... 3 1
6 PartACity 6 PartA ... 4 1
7 PartACity 7 PartA ... 4 1
8 PartACity 8 PartA ... 4 1
9 PartACity 9 PartA ... 4 1
[9 rows x 6 columns]
希望这可以帮助!