InsideDarkWeb.com

Automating a set of weekly reports, including graphs and delivery of reports

I have been writing code to automate some weekly reports, with some help over on Stack Overflow. The code works for the most part, but there are a few things that I just can’t seem to fix.

In short, I loop through the data and create a dictionary of dataframes keyed by the unique values of the ‘location’ column. I can use the dictionary to make a summary report for each location. I wanted to make another dictionary from this based on ‘sublocation’. Instead, with some advice, I make a list of each sublocation, access each item in the dataframe dictionary, loop to find the corresponding sublocations, and make plots.

My problems are as follows:

  1. The code is slow.
  2. The graphs are not formatted properly (they overlap even with tight_layout).
  3. For the sublocation reports, I am having a hard time saving to the right folder. I think this has to do with the way I format the string passed to savefig. For each sublocation I want to reference the location name using value['location'], but I think this is updated on every loop, so it doesn’t work.
  4. I have the try/except because, when matching a sublocation to a location, not every sublocation will appear in the dataframe stored as the dict value.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

f = 'path'
d = pd.read_csv(f)
dfs = dict(tuple(d.groupby('location')))  # one dataframe per location
for key, value in dfs.items():
    try:
        fig, axs = plt.subplots(2, 3)
        sns.countplot(y='ethnic', data=value, orient='h', palette='colorblind', ax=axs[0, 0])
        sns.countplot(y='Ratio', data=value, orient='h', palette='colorblind', ax=axs[1, 0])
        sns.countplot(y='site', data=value, ax=axs[0, 1])
        sns.countplot(y='STATUS', data=value, ax=axs[1, 1])
        sns.countplot(y='Assessment', data=value, ax=axs[0, 2])
        #pth = os.path.join(tmppath, '{0}'.format(key))
        # label every horizontal bar with its count
        for ax in axs.flat:
            for p in ax.patches:
                ax.text(p.get_width(), p.get_y() + p.get_height() / 2., '%d' % int(p.get_width()),
                        fontsize=12, color='red', ha='left', va='center')
        fig.suptitle('{0} Summary'.format(key))  # pyplot has no set_title; title the whole figure
        plt.tight_layout(pad=2.0, w_pad=1.0, h_pad=2.0)
        plt.savefig("basepath/testing123/{0}/{1}.pdf".format(key, key), bbox_inches='tight')
        plt.clf()

        #plt.show()
    except Exception:
        plt.savefig("basepath/{0}/{1}.pdf".format(key, key), bbox_inches='tight')
        #plt.savefig("{0}.pdf".format(key), bbox_inches = 'tight');

# Now for sublocations

dfss = dict(tuple(d.groupby('site')))  # one dataframe per sublocation

#%%

for key, value in dfss.items():
    a = repr(value['school_dbn'][:1])

    try:
        fig, axs = plt.subplots(2, 3)
        #tmppath = 'basepath/{0}'.format(key);
        #pth = os.path.join(tmppath, '{0}'.format(key))
        sns.countplot(y='ethnic', data=value, orient='h', palette='colorblind', ax=axs[0, 0])
        sns.countplot(y='Program', data=value, orient='h', palette='colorblind', ax=axs[1, 0])
        sns.countplot(y='AltAssessment', data=value, ax=axs[0, 2])
        fig.suptitle('{0} Summary'.format(key))  # pyplot has no set_title; title the whole figure
        plt.tight_layout(pad=2.0, w_pad=1.0, h_pad=2.0)
        # value['location'][-6:] is a slice of the column (its last six rows), not a single name
        plt.savefig("basepath/{0}/{1}_{2}.pdf".format(value['location'][-6:], value['location'][-6:], key), bbox_inches='tight')
        plt.clf()

        #plt.show()
    except Exception:
        plt.savefig("basepath/testing123/{0}/{1}_{2}.pdf".format(value['location'][-6:], value['location'][-6:], key), bbox_inches='tight')
        #plt.savefig("{0}.pdf".format(key), bbox_inches = 'tight');

The reason I want to save like this is that each location has a folder with the same name. Each sublocation belongs to only one location, so I want to save as ‘location_sublocation.pdf’.
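
One way to get the file name described above is to take a single value from the group's location column rather than a slice of the column. This is a minimal sketch, assuming the `location` and `site` column names from the question; the helper name `report_path`, the sample rows, and the temporary folder standing in for `basepath` are all hypothetical.

```python
import os
import tempfile

import pandas as pd


def report_path(base, group, sublocation):
    """Return base/<location>/<location>_<sublocation>.pdf for one sublocation group."""
    # Each sublocation belongs to one location, so the first row is representative.
    location = str(group['location'].iloc[0])
    folder = os.path.join(base, location)
    os.makedirs(folder, exist_ok=True)  # create the location folder if it is missing
    return os.path.join(folder, '{0}_{1}.pdf'.format(location, sublocation))


# Hypothetical sample rows standing in for the real CSV.
d = pd.DataFrame({
    'location': ['North', 'North', 'South'],
    'site': ['N1', 'N2', 'S1'],
})

base = tempfile.mkdtemp()  # stand-in for 'basepath'
paths = {site: report_path(base, group, site) for site, group in d.groupby('site')}
for site, path in paths.items():
    print(site, '->', os.path.relpath(path, base))
```

The returned string can be passed directly to `plt.savefig(...)` inside the sublocation loop.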

Code Review Asked by Moo10000 on November 11, 2021

One Answer

I got this done by making a second dictionary that takes locations as keys and lists of sublocations as values.

dfs = dict(tuple(data.groupby('location')))
dfss = dict(tuple(data.groupby('sublocation')))

dd = {}

for key, value in dfs.items():  # dictionary is made of groupby object, key is
                                # location, value is dataframe
    a = []
    for i in value['sublocation']:
        if str(i) not in a:  # compare as strings, since strings are what get stored
            a.append(str(i))
    dd[key] = a
for key, value in dfss.items():
    for k, v in dd.items():
        if str(key) in v:  # this location lists the current sublocation
            dur = str(k)
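
The same two lookups can also be built directly with pandas. This is a sketch, assuming the `location` and `sublocation` columns used above; the sample rows are made up, and `sub_to_loc` is a hypothetical name for a reverse lookup that would replace the inner loop.

```python
import pandas as pd

# Made-up rows standing in for the real data.
data = pd.DataFrame({
    'location': ['North', 'North', 'North', 'South'],
    'sublocation': ['N1', 'N2', 'N1', 'S1'],
})

# location -> list of its unique sublocations (same shape as dd above)
dd = {
    loc: [str(s) for s in subs.unique()]
    for loc, subs in data.groupby('location')['sublocation']
}

# sublocation -> location, so no inner loop is needed to find `dur`
pairs = data[['sublocation', 'location']].drop_duplicates()
sub_to_loc = dict(zip(pairs['sublocation'].astype(str), pairs['location'].astype(str)))

for key, value in data.groupby('sublocation'):
    dur = sub_to_loc[str(key)]
    print(key, 'belongs to', dur, 'with', len(value), 'rows')
```

A plain dictionary lookup replaces the scan over every location for each sublocation, which may matter once there are several hundred sublocations.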

Then in the next cell,

for key, value in dfss.items():
    for k, v in dd.items():
        if str(key) in v:
            dur = str(k)
    #tmp = value[value['sublocation']==i]
    # ui is a palette defined elsewhere; palplot() only draws a preview and
    # returns None, so pass the palette itself
    sns.set(style='white', palette=sns.color_palette(ui), font='sans-serif')

I think I can make the overall script run even faster by using more regular expressions to filter the dataframe in various steps.
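
Whether regular expressions actually speed things up would need to be measured, but a vectorized string match does avoid a Python-level loop over rows. This is a sketch with made-up site names; the pattern is only an illustration, not one taken from the real data.

```python
import pandas as pd

df = pd.DataFrame({
    'site': ['N1-East', 'N2-West', 'S1-East', 'S2'],
    'STATUS': ['open', 'closed', 'open', 'open'],
})

# Boolean mask built in one vectorized call; na=False treats missing names as no match.
mask = df['site'].str.contains(r'^N\d+-', regex=True, na=False)
north_sites = df[mask]
print(north_sites)
```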

This set-up works because I can save the files according to the keys from the two dictionaries. It allows me to save the nearly 375 files automatically. I use another script to move the files to their respective folders.

plt.savefig("path/{0}/{1} @ {2}.pdf".format(dur,dur,key), bbox_inches = 'tight')
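
When several hundred files are saved in a loop, it may help to close each figure after saving, since figures created through pyplot stay in memory until they are closed. A larger `figsize` with constrained layout is also one option for the overlapping panels mentioned in the question. This is a sketch under those assumptions, with made-up data, plain matplotlib bars instead of seaborn, and a temporary folder in place of the real path.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, enough for writing files
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the real data.
data = pd.DataFrame({
    'sublocation': ['N1', 'N1', 'N2', 'S1'],
    'ethnic': ['a', 'b', 'a', 'b'],
    'STATUS': ['open', 'open', 'closed', 'open'],
})

out_dir = tempfile.mkdtemp()  # stand-in for the real output path
saved = []
for key, value in data.groupby('sublocation'):
    # A larger canvas plus constrained layout leaves room for tick labels and titles.
    fig, axs = plt.subplots(1, 2, figsize=(12, 5), constrained_layout=True)
    for ax, column in zip(axs, ['ethnic', 'STATUS']):
        counts = value[column].value_counts()
        ax.barh(list(counts.index.astype(str)), counts.values)
        ax.set_title(column)
    fig.suptitle('{0} Summary'.format(key))
    target = os.path.join(out_dir, '{0}.pdf'.format(key))
    fig.savefig(target)
    plt.close(fig)  # release the figure so memory does not grow with each report
    saved.append(target)

print(len(saved), 'files written; open figures:', len(plt.get_fignums()))
```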

A slightly different case: take three data sets and split each one into smaller data sets based on some column, such as location.

from matplotlib.backends.backend_pdf import PdfPages

oct_dict = dict(tuple(oct.groupby('location')))
oct2_dict = dict(tuple(oct2.groupby('location')))
for k, v in oct_dict.items():
    #try:
        #v2 = stu_dict[k]      #sometimes using this try/except method works better
    #except KeyError:
        #v2 = pd.DataFrame()
    #try:
        #v3 = oct2_dict[k]
    #except KeyError:
        #v3 = pd.DataFrame()
    for k2, v2 in stu_dict.items(): #replace with v2 = stu_dict[k] if you know for sure it exists
        for k3, v3 in oct2_dict.items(): #replace with v3 = oct2_dict[k] if you know for sure it exists
            if k == k2 and k == k3: #can delete this if not needed
                plt.close('all')
                with PdfPages(r'path{}.pdf'.format(k)) as pdf:
                    ...  # plots for location k are added to the PDF here
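
The nested loops above only do work when a key appears in all three dictionaries, so the shared keys can be computed once up front. This is a sketch, assuming three made-up data frames in place of the real data sets; the names `oct_df`, `stu_df`, `oct2_df`, the column `n`, and the one-bar plots are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, enough for writing files
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Made-up stand-ins for the three data sets.
oct_df = pd.DataFrame({'location': ['North', 'South'], 'n': [1, 2]})
stu_df = pd.DataFrame({'location': ['North', 'South', 'East'], 'n': [3, 4, 5]})
oct2_df = pd.DataFrame({'location': ['North', 'East'], 'n': [6, 7]})

oct_dict = dict(tuple(oct_df.groupby('location')))
stu_dict = dict(tuple(stu_df.groupby('location')))
oct2_dict = dict(tuple(oct2_df.groupby('location')))

# Locations present in all three dictionaries.
shared = sorted(set(oct_dict) & set(stu_dict) & set(oct2_dict))

out_dir = tempfile.mkdtemp()  # stand-in for the real output path
written = []
for k in shared:
    target = os.path.join(out_dir, '{}.pdf'.format(k))
    with PdfPages(target) as pdf:
        frames = [('oct', oct_dict[k]), ('stu', stu_dict[k]), ('oct2', oct2_dict[k])]
        for label, frame in frames:
            fig, ax = plt.subplots()
            ax.bar([label], [frame['n'].sum()])
            ax.set_title('{} - {}'.format(k, label))
            pdf.savefig(fig)  # one page per data set
            plt.close(fig)
    written.append(target)

print(shared)
```

If a missing location should still produce a report, `stu_dict.get(k, pd.DataFrame())` gives the empty-dataframe fallback that the commented-out lines describe.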

Answered by Moo10000 on November 11, 2021
