Efficiently Reorder DataFrame of Lists/Pairings

I have an efficiency question. Essentially I have a dataframe filled with lists. Each list contains a value and a string describing that value (I assumed that a list format would be the easiest way to sort pairings). I need to separately reorder the values in each row with the highest value to the left and the lowest value to the right. I have found a solution to this, but given that I am a newer programmer, I wanted to know if you believe there is a quicker way of doing this operation without iterating through the indexes. Please feel free to provide any sort of feedback that you have. The only requirement I have is that the final solution is a dataframe where a value is immediately followed by its string descriptor (the string descriptor could be in its own adjacent column, doesn't need to be in a list).

Starting DF:

import pandas as pd
import numpy as np

master_stop = pd.DataFrame([[[56, 'Support'], [58, 'MA']],
                            [[24.4, 'Support'], [23.3, 'MA'], [25, 'MA']]],
                           ['Symbol_1', 'Symbol_2']).fillna(np.nan)
master_stop

Out[2]:
                        0           1         2
Symbol_1    [56, Support]    [58, MA]       NaN
Symbol_2  [24.4, Support]  [23.3, MA]  [25, MA]

Sorting Method That I'm Looking to Improve:

def sort_df():
    # Sort each row's [value, label] pairs from highest to lowest, in place.
    for index in master_stop.index:
        master_stop.loc[index] = master_stop.loc[index].sort_values(ascending=False).values

Sorted DF:

sort_df()
master_stop

Out[3]:
                 0                1           2
Symbol_1  [58, MA]    [56, Support]         NaN
Symbol_2  [25, MA]  [24.4, Support]  [23.3, MA]

edited Nov 22 '18 at 19:41

asked Nov 22 '18 at 19:12

Whip

python pandas


1 Answer

You can do this with stack, sort_values, sort_index and unstack. It takes more than one line:

# Flatten to a Series indexed by (symbol, column position).
master_stack = master_stop.stack().sort_index(level=0, ascending=[True])

# Sort all pairs from high to low, regroup them by symbol,
# then write the values back onto the original index and reshape.
master_stop = (pd.Series(data=master_stack.sort_values(ascending=False)
                                          .sort_index(level=0, ascending=[True]).values,
                         index=master_stack.index)
                 .unstack())

After this, master_stop is sorted as expected:

                 0                1           2
Symbol_1  [58, MA]    [56, Support]         NaN
Symbol_2  [25, MA]  [24.4, Support]  [23.3, MA]
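
For comparison, here is a minimal alternative sketch that does not use stack/unstack and does not depend on how pandas orders list objects. It sorts each row's pairs in plain Python by the numeric first element and pads short rows with NaN. The helper name sort_pairs_desc is hypothetical, it assumes every non-missing cell is a [value, label] list, and it is not the method from the answer. No timing claim is made for it.

```python
import numpy as np
import pandas as pd


def sort_pairs_desc(df):
    """Return a new frame whose rows hold their [value, label] pairs sorted high to low.

    Hypothetical helper: assumes every non-missing cell is a [value, label] list.
    """
    rows = []
    for row in df.to_numpy():
        # Keep only real pairs; NaN padding cells are floats, not lists.
        pairs = [cell for cell in row if isinstance(cell, list)]
        pairs.sort(key=lambda pair: pair[0], reverse=True)
        # Pad on the right so every row keeps the original width.
        rows.append(pairs + [np.nan] * (len(row) - len(pairs)))
    return pd.DataFrame(rows, index=df.index, columns=df.columns)


# Same data as in the question, with the missing cell written out explicitly.
master_stop = pd.DataFrame(
    [[[56, "Support"], [58, "MA"], np.nan],
     [[24.4, "Support"], [23.3, "MA"], [25, "MA"]]],
    index=["Symbol_1", "Symbol_2"],
)
sorted_stop = sort_pairs_desc(master_stop)
print(sorted_stop)
```

Because the function returns a new frame instead of mutating a global, it can be called on any frame with the same cell layout.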

edited Nov 22 '18 at 20:56

answered Nov 22 '18 at 19:55

Ben.T


This solution works for the two symbols, but when I run it AFTER the code below, which increases the number of rows in master_stop, I get a wildly unsorted DF (sorry, I don't know how to indent code in comments): for i in range(100): master_stop.loc[i,0] = [100,'Support']; master_stop.loc[i,1] = [102,'MA']

– Whip
Nov 22 '18 at 20:24

@Whip indeed, sorry. I fixed my error by adding sort_index; the code has been edited. You can also have a look at groupby, but I think using sort_index will be faster if you have a lot of rows in your original dataframe.

– Ben.T
Nov 22 '18 at 20:57


Thank you! I can indeed confirm that your code is an improvement. My original code using an additional 100 entries ran at approx. 144ms, while yours is running at 88ms, providing a substantial improvement. Before I accept your answer, I plan on leaving the question open a bit longer in case anybody else has any unique alternative solutions.

– Whip
Nov 22 '18 at 21:27

@Whip good :) and I would guess that the gain in time will increase with the number of rows.

– Ben.T
Nov 22 '18 at 21:37

Quick follow-up question! Why did you have to put []s around the second 'True' in the second line when you call sort_index? I notice that in the first master_stack line, the []s around True didn't make a difference in the output, but in the second line having the brackets around [True] makes a big difference. I'm guessing it's function-specific, since sort_values didn't require []s around the False... but I couldn't find anything in the pandas documentation.

– Whip
Nov 23 '18 at 17:28
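
On the bracket question, one way to read it, offered as a sketch and not as an authoritative account of pandas internals: sort_index has a documented sort_remaining parameter (default True). With a scalar ascending, sorting on level=0 also sorts by the remaining index level, which puts each row back in its original column order and discards the value ordering. Passing a list appears to restrict the sort to the listed levels only. The example below uses plain floats instead of [value, label] lists and makes the same distinction explicit with sort_remaining=False; the variable names are illustrative.

```python
import pandas as pd

# Stacked layout: outer level is the symbol, inner level is the column position.
index = pd.MultiIndex.from_tuples(
    [("Symbol_1", 0), ("Symbol_1", 1),
     ("Symbol_2", 0), ("Symbol_2", 1), ("Symbol_2", 2)]
)
values = pd.Series([56.0, 58.0, 24.4, 23.3, 25.0], index=index)

# Global sort, highest value first, ignoring which symbol a value belongs to.
by_value = values.sort_values(ascending=False)

# Default sort_remaining=True: the inner level is sorted as well, so each
# symbol's values return to their original column order.
restored = by_value.sort_index(level=0, ascending=True)

# sort_remaining=False: regroup by symbol only, leaving the inner level alone.
grouped = by_value.sort_index(level=0, sort_remaining=False)

print(restored.tolist())
print(grouped.tolist())
```

This relies on the level sort keeping ties in their incoming order, which is worth verifying on your own pandas version.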
