How to deal with a dataset with “periods of time” and missing data



























I'm working on a dataset whose columns are points in time (e.g. August, September, etc.) and whose rows are the different measurements collected at each point.

The data is also not clean at all: there is a lot of missing data, and I can't simply drop every row that contains missing values or fill them all in, so my idea was to divide the dataset into 4 smaller ones.

What kind of analysis can be performed on a dataset like this? Should I invert columns and rows?



































  • You could invert the columns/rows and then perform time series imputation with an R package like imputeTS. Whether this actually makes sense depends a lot on your dataset.

    – stats0007
    Nov 27 '18 at 17:11
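
To make the suggestion above concrete, here is a minimal sketch in plain Python. It is not the imputeTS API (that is an R package); it only illustrates the two steps the comment names: turning a wide table into one row per time point, and filling gaps by linear interpolation. The table, the measurement names `satisfaction` and `complaints`, and the helper `interpolate_inside` are all made up for illustration.

```python
# Hypothetical wide table: one entry per measurement, one value per month.
months = ["Aug", "Sep", "Oct", "Nov", "Dec"]
wide = {
    "satisfaction": [7.0, None, 8.0, None, None],
    "complaints": [None, 4.0, None, 2.0, 1.0],
}


def interpolate_inside(series):
    """Linearly fill gaps that lie between two observed values.

    Leading and trailing gaps stay None, because there is no observed
    value on one side to interpolate from.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# "Inverting" the table: one row per month, one field per measurement.
names = list(wide)
long_rows = [
    {"month": month, **dict(zip(names, values))}
    for month, values in zip(months, zip(*wide.values()))
]

# Impute each measurement separately along the time axis.
filled = {name: interpolate_inside(values) for name, values in wide.items()}

print(long_rows[0])
print(filled)
```

Note that the sketch deliberately leaves gaps at the start and end of a series unfilled. With very few observations per series, even the interior values it produces are guesses rather than data, so whether this is appropriate still depends on the dataset.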











  • I have very few observations, and the dataset consists of consumer satisfaction data. I have some doubts, but do you think it would be a good idea?

    – Zhang_anlan
    Nov 27 '18 at 17:27
















dataset regression cluster-computing data-analysis missing-data














edited Nov 23 '18 at 8:14







Zhang_anlan

















asked Nov 23 '18 at 7:59









Zhang_anlan

1 Answer














A time series regression with missing data is a special case in statistical analysis. Simply rearranging the dataset is not a solution.



As I understand it, periodicity analysis and spectral analysis are used to identify the sinusoid of best fit: a sine wave is fitted to the observed points and carried across the missing ones, and regression is one way to find the fit to the existing data.
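
As a rough illustration of that idea, the sketch below fits y = a + b·sin(ωt) + c·cos(ωt) to the observed points by ordinary least squares and evaluates the fitted curve at the missing ones. It assumes the period is already known (12 months here), which sidesteps the spectral-analysis step entirely, and it uses synthetic, noise-free data. The helpers `solve_linear_system` and `fit_sinusoid` are hypothetical names written for this example.

```python
import math


def solve_linear_system(a, b):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_sinusoid(values, period):
    """Least-squares fit of a + b*sin(wt) + c*cos(wt), using observed points only."""
    w = 2 * math.pi / period
    rows = [
        (1.0, math.sin(w * t), math.cos(w * t), y)
        for t, y in enumerate(values)
        if y is not None
    ]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * r[3] for r in rows) for i in range(3)]
    a, b, c = solve_linear_system(xtx, xty)
    return lambda t: a + b * math.sin(w * t) + c * math.cos(w * t)


# Synthetic monthly series with a 12-month cycle and three missing months.
series = [5 + 2 * math.sin(2 * math.pi * t / 12) for t in range(12)]
for missing in (3, 4, 8):
    series[missing] = None

model = fit_sinusoid(series, period=12)
filled = [y if y is not None else model(t) for t, y in enumerate(series)]
print([round(v, 3) for v in filled])
```

On real data the period would have to be estimated first, and with only a handful of noisy observations the three fitted coefficients can be poorly determined, so the filled values should be treated with caution.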



The same question has been raised before on Stats Stack Exchange, where the answers are based on ARIMA (autoregressive integrated moving average). Personally, I am not convinced by that approach, because I expect a more specialised solution exists.
https://stats.stackexchange.com/questions/121414/how-do-i-handle-nonexistent-or-missing-data






answered Nov 23 '18 at 8:28

Michael G.
