In an earlier module, you created programs that read the contents of a large file, process it, and write the results into another large file (code at the end). What if the files were 10x bigger, i.e. instead of a million rows, they had 10 million rows? Which of the following methods would have the fastest processing time?

- Run the process as it is, with the larger files.
- Break the files up into 10 files and schedule the processes to run 30 seconds or 1 minute apart, then combine the resulting files into a single output file.
- Break the files up into 2 files and schedule the processes to run 30 seconds or 1 minute apart, then combine the resulting files into a single output file.
- Break the files up into 5 files and schedule the processes to run 30 seconds or 1 minute apart, then combine the resulting files into a single output file.
- Break the files up into 20 files and schedule the processes to run 30 seconds or 1 minute apart, then combine the resulting files into a single output file.

Can you think of other ways to increase efficiency and reduce processing time? (One possible approach is sketched after the code listing below.)

Code from previous lesson:

import random
import os
import sys

# getting the datetime import
from datetime import datetime

# read the entire file into memory and print
def readFile1(filename):
    f = open(filename)
    all_lines = f.readlines()
    all_lines = "".join(all_lines)
    print(all_lines)

# read the file one line at a time in memory and print it
def readFile2(filename):
    with open(filename) as f:
        for line in f:
            print(line)

def readFile3(filename):
    # get file size
    f_size = os.path.getsize(filename)
    f = open(filename)
    # depending upon the file size determine the half way mark in bytes
    if f_size % 2 == 0:
        read_until = int(f_size / 2)
    else:
        read_until = int((f_size + 1) / 2)
    # read the first half of the file into memory
    first_half = f.read(read_until)
    # print the first half that has been read into memory
    all_lines = "".join(first_half)
    print(all_lines)
    print(">>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<")
    # read the second part into memory using file seek
    f.seek(read_until + 1)
    second_half = f.read()
    # print the second half that has been read into memory
    all_lines = "".join(second_half)
    print(all_lines)

def main():
    # What time does this start at?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)

    # open a file named file2
    outfile = open('file2.txt', 'w')
    # produce the numbers
    for count in range(1000000):
        # get a random number
        num = random.randint(1, 1000)
        outfile.write(str(num) + "\n")
    # Close out the text file
    outfile.close()
    print('Data complete')

    # How long did it take?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)

    # get the filename from command line argument
    filename = 'file2.txt'

    # What time does this start at?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)
    # read file using style 1
    readFile1(filename)
    # How long did it take?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)

    print("-------------------------")

    # What time does this start at?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)
    # read file using style 2
    readFile2(filename)
    # How long did it take?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)

    print("-------------------------")

    # What time does this start at?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)
    # read file using style 3
    readFile3(filename)
    # How long did it take?
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current time = ", current_time)

main()
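Below is a minimal sketch, not part of the lesson code, of the "break the file up and run the pieces concurrently" idea, assuming each line of the input can be processed independently. The chunk count, the process_chunk function, and the file names are hypothetical; the lesson code only generates and reads the numbers, so the per-line work shown here (doubling each number) is a stand-in. Rather than staggering the runs by 30 seconds or 1 minute, the sketch runs the chunks in parallel with multiprocessing.Pool and then concatenates the partial results.

import multiprocessing

NUM_CHUNKS = 10  # assumption: how many pieces to split the input into

def split_file(filename, num_chunks):
    # Split the input file into num_chunks smaller files, line by line.
    with open(filename) as f:
        lines = f.readlines()
    chunk_size = (len(lines) + num_chunks - 1) // num_chunks
    chunk_names = []
    for i in range(num_chunks):
        chunk_name = "chunk_" + str(i) + ".txt"
        with open(chunk_name, "w") as out:
            out.writelines(lines[i * chunk_size:(i + 1) * chunk_size])
        chunk_names.append(chunk_name)
    return chunk_names

def process_chunk(chunk_name):
    # Hypothetical per-chunk work: double each number and write a result file.
    result_name = chunk_name.replace("chunk", "result")
    with open(chunk_name) as src, open(result_name, "w") as dst:
        for line in src:
            dst.write(str(int(line) * 2) + "\n")
    return result_name

def combine(result_names, output_name):
    # Concatenate the per-chunk result files into a single output file.
    with open(output_name, "w") as out:
        for name in result_names:
            with open(name) as part:
                out.write(part.read())

if __name__ == "__main__":
    chunks = split_file("file2.txt", NUM_CHUNKS)
    # Process the chunks in parallel instead of scheduling them apart in time.
    with multiprocessing.Pool() as pool:
        results = pool.map(process_chunk, chunks)
    combine(results, "combined_output.txt")

Note that splitting and recombining add their own overhead, so whether an approach like this beats simply running the original program on the larger file depends on how CPU-bound the per-row processing is.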



Answer:
