About the project:
Hello everybody, I'm doing continuing education in which Python is a relatively large part. With my basic knowledge I have practically no chance, especially in the project work, which we have to deliver on June 20, 2020. We are not allowed to use Gephi; everything must be analyzed and derived in Python. So I depend on help and am grateful for any support, no matter how small.
I have a dataset in GEXF format. The nodes are "students" (student ID) and "teachers" (teacher ID). Each node belongs to a school class and has a gender; for teachers, no gender is given. There are 10 school classes, e.g. 1A, 2B, up to 5A and 5B. The edges connect the pupil IDs by means of "Origin" and "Destination". All edges are of the type "Undirected" and have weight "1". "duration" is the total duration of all interactions between the two nodes (origin and destination); "count" is the number of times the origin and destination came together in an interaction.
I have attached the dataset. If someone can help me with this, I would send him/her the dataset by mail. And of course, I would pay something for the work.
So far I have managed to get the system to tell me how many nodes and edges there are and what the average degree per node is. That was it. Now I want to read various information out of this dataset with Python and the package networkx and, above all, display it graphically. This causes me many difficulties, because in Python I want to work not only with the nodes and edges in the GEXF document but also with the values per item, for example the ID of a student or a class name. And these are my questions:
I want to implement the following task with the Python package networkx and a GEXF file:
The focus is on these questions:
The goal is that Python calculates and shows everything that is required. I must be able to run the code on a second dataset without making any adjustments to it.
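One way to meet the "second dataset without adjustments" requirement is to keep the file path as the only input and derive everything else from the graph itself. The following is a minimal sketch, not a finished solution: the helper names `load_contact_graph` and `summarize` and the toy file are made up for illustration, and the tiny graph only stands in for the real GEXF file.

```python
import os
import tempfile

import networkx as nx


def load_contact_graph(path):
    """Read a GEXF file; the path is the only dataset-specific input."""
    return nx.read_gexf(path)


def summarize(G):
    """Return node count, edge count and average degree of an undirected graph."""
    n = G.number_of_nodes()
    m = G.number_of_edges()
    # In an undirected graph every edge contributes to the degree of two nodes.
    avg_degree = 2 * m / n if n else 0.0
    return {"nodes": n, "edges": m, "avg_degree": avg_degree}


# Toy data (made up) written to a temporary file so the sketch runs on its own.
toy = nx.Graph()
toy.add_edge("1551", "1552", count=3, duration=60)
toy.add_edge("1551", "1553", count=1, duration=20)
toy_path = os.path.join(tempfile.mkdtemp(), "toy_contacts.gexf")
nx.write_gexf(toy, toy_path)

summary = summarize(load_contact_graph(toy_path))
print(summary)
```

Running the same two functions on another GEXF file only requires passing a different path.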
The analysis should focus on three out of ten school classes:
a) One school class whose average degree is the highest compared with the other 9 school classes.
b) One school class whose average degree is the lowest compared with the other 9 school classes.
c) One school class whose average degree in contacts with other school classes is the highest compared with the other school classes.
It must be possible to determine the following:
a) Are they school classes with younger students, i.e. school classes 1A, 1B, 2A, 2B? Or rather with older students, i.e. school classes 3A, 3B, ..., 5A, 5B?
b) Are female or male students more active in the social network?
c) Who has the best contact with the teachers? Male students? Female students? Which class?
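Questions a) to c) above can be approached by grouping students and averaging their weighted degree. The sketch below is only an illustration under several assumptions: the node attribute names `classname` and `gender`, the rule that a node without a gender is a teacher, and the toy data are all hypothetical and would have to be matched to the real GEXF file.

```python
from collections import defaultdict

import networkx as nx

# Toy data (made up); attribute names are assumptions, not taken from the file.
G = nx.Graph()
G.add_nodes_from([
    ("s1", {"classname": "1A", "gender": "F"}),
    ("s2", {"classname": "1A", "gender": "M"}),
    ("s3", {"classname": "5B", "gender": "M"}),
    ("t1", {"classname": "Teachers"}),  # no gender given for teachers
])
G.add_edge("s1", "s2", count=4, duration=80)
G.add_edge("s1", "t1", count=2, duration=40)
G.add_edge("s3", "t1", count=1, duration=20)


def is_teacher(attrs):
    """Assumption: a node without a gender attribute is a teacher."""
    return attrs.get("gender") is None


def age_group(attrs):
    """'younger' for grades 1-2, 'older' for grades 3-5, read from the class name."""
    return "younger" if int(attrs["classname"][0]) <= 2 else "older"


def mean_activity(G, group_of, edge_attr="count"):
    """Average weighted degree of the students in each group."""
    totals, sizes = defaultdict(float), defaultdict(int)
    for node, attrs in G.nodes(data=True):
        if is_teacher(attrs):
            continue
        group = group_of(attrs)
        totals[group] += G.degree(node, weight=edge_attr)
        sizes[group] += 1
    return {group: totals[group] / sizes[group] for group in totals}


def teacher_contact(G, group_of, edge_attr="duration"):
    """Sum of an edge attribute over all student-teacher edges, per student group."""
    totals = defaultdict(float)
    for u, v, data in G.edges(data=True):
        for student, other in ((u, v), (v, u)):
            if not is_teacher(G.nodes[student]) and is_teacher(G.nodes[other]):
                totals[group_of(G.nodes[student])] += data.get(edge_attr, 0)
    return dict(totals)


by_gender = mean_activity(G, lambda attrs: attrs["gender"])
by_age = mean_activity(G, age_group)
teacher_time = teacher_contact(G, lambda attrs: attrs["gender"])
print(by_gender, by_age, teacher_time)
```

Passing `lambda attrs: attrs["classname"]` as `group_of` gives the same figures per school class, which is one possible basis for picking the most and least active classes.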
1. How can I find out which data type a feature has? How can I convert the data type of a feature, for example from string to int (preprocessing)?
2. How can I calculate the number of edges per node (1 origin and how many destinations)? I have seen that this is called "degree". For example, I want to know how many connections (edges) node 1551 has to other nodes, i.e. how many connections student 1551 has to other students (and teachers). How can I list them per node? What does the code for this look like?
3. How can I calculate, for example, the sum of "count" or "duration" per student ID? How can I divide this sum by the number of connected nodes per student ID, for example to calculate the average duration?
4. How can I build and display the graph for the whole dataset? What does the code for this look like?
5. How can I display graphs, i.e. the connections and nodes of a single school class (clusters?), single nodes (students) of the same class, etc., in different colors? What does the code for this look like?
6. Subgraphs: are they parts of a whole graph, as I understood it? How can I display them, for example for two or three classes, for ten students per class, for students and teachers together, etc.? Or, for example, for the connections within a class? What does the code for this look like?
7. Are subgraphs also called "subgroups" and "clusters"? What does the code look like so that I can represent such properties graphically? What do I concentrate on in the dataset: for example items, or values?
8. How can I determine which student is an "influencer" in the class and which student is not an influencer at all? And which students are the "influencers" between school classes? What does the code for this look like?
9. How can I remove items, for example because they represent an outlier? Or simply remove all items whose "count" is below 10? What does the code for this look like?
10. Weight: I have read a lot about it, but I have not been able to figure out what it is and what purpose it serves. What does it tell me? What can I do with it? How can I change this weight? Why is the weight adjusted in different ways, i.e. for certain connections the weight is set to 2, for others to 3, etc.? And above all: why should I change a weight, i.e. what could be the intention? What would this mean for "my" dataset? In our case, the weight is 1 between all origins and destinations.
In which case would it make sense to change weights, and why? What is the code for changing the weight?
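For questions 9 and 10 above, one possible reading is this: a weight of 1 on every edge carries no information, whereas "count" and "duration" say how strong a contact is, so copying one of them into the weight lets weighted measures treat frequent contacts as stronger. The sketch below shows that idea together with the count-below-10 filter. The toy edges are made up, and using "count" as the weight is only one of several reasonable choices.

```python
import networkx as nx

# Toy data (made up): every edge starts with weight 1, as in the dataset description.
G = nx.Graph()
G.add_edge("s1", "s2", weight=1, count=25, duration=500)
G.add_edge("s1", "s3", weight=1, count=3, duration=60)
G.add_edge("s2", "s3", weight=1, count=12, duration=240)

# Question 10: overwrite the uniform weight with the number of interactions.
for u, v, data in G.edges(data=True):
    data["weight"] = data["count"]

strength_s1 = G.degree("s1", weight="weight")  # sum of counts on s1's edges

# Question 9: drop every edge whose count is below a threshold.
MIN_COUNT = 10
weak_edges = [(u, v) for u, v, c in G.edges(data="count", default=0) if c < MIN_COUNT]

H = G.copy()  # keep the original graph untouched
H.remove_edges_from(weak_edges)
# Nodes that lost all their edges can be removed as well.
H.remove_nodes_from(list(nx.isolates(H)))

print(strength_s1, weak_edges, H.number_of_edges())
```

One caveat: shortest-path-based functions in networkx interpret a weight as a distance, so for those measures a large count would make two students look far apart unless the weight is inverted first.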
11. How to calculate the following? For example, is this calculated per student or even per cluster (e.g. school class)? Or for which properties is this calculated?
- Degree Centrality
- Betweenness Centrality
- Closeness Centrality
- Indegree Prestige
- Ego Network
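For the measures listed in question 11, networkx returns one value per node; a per-class figure can be obtained by running the same function on the subgraph of that class. The following is a minimal sketch on a made-up toy graph. Treating the node with the highest betweenness as the "influencer" of question 8 is an assumption, not a fixed definition, and the list of class members is hypothetical.

```python
import networkx as nx

# Toy data (made up).
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")])

degree_c = nx.degree_centrality(G)        # degree / (n - 1), per node
between_c = nx.betweenness_centrality(G)  # share of shortest paths through a node
close_c = nx.closeness_centrality(G)      # (n - 1) / sum of distances to all others

# Indegree needs a directed graph; with undirected edges converted to two
# opposite arcs it coincides with degree centrality.
indegree_c = nx.in_degree_centrality(G.to_directed())

# Ego network: a node plus its direct neighbours and the edges among them.
ego_a = nx.ego_graph(G, "a", radius=1)

# One possible "influencer": the node with the highest betweenness.
influencer = max(between_c, key=between_c.get)

# Per-class values: restrict the graph to the members of one class first.
class_members = ["b", "c", "d"]  # hypothetical members of one school class
within_class = nx.degree_centrality(G.subgraph(class_members))

print(influencer, sorted(ego_a.nodes), within_class)
```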
12. Link Predictions? I heard that there are ways to use different models (or algorithms?) to predict what other possible connections between nodes or the students (in our case) might look like. For example Jaccard, Common Neighbours, Preferential Attachment, Resource Allocation, etc. What does the code for these look like?
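The indices named in question 12 are available in networkx as neighbourhood-based scores for pairs of nodes that are not yet connected; a higher score is read as a more likely future link. The sketch below collects them in one table for a made-up toy graph. The helper name `score_table` is hypothetical, and the scores are heuristics rather than trained models.

```python
import networkx as nx

# Toy data (made up).
G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")])


def score_table(G):
    """Score every unconnected node pair with four link-prediction indices."""
    scores = {}
    # Without an explicit pair list, each function iterates over all non-edges.
    for u, v, jaccard in nx.jaccard_coefficient(G):
        scores[frozenset((u, v))] = {
            "jaccard": jaccard,
            "common_neighbours": len(list(nx.common_neighbors(G, u, v))),
        }
    for u, v, pa in nx.preferential_attachment(G):
        scores[frozenset((u, v))]["preferential_attachment"] = pa
    for u, v, ra in nx.resource_allocation_index(G):
        scores[frozenset((u, v))]["resource_allocation"] = ra
    return scores


scores = score_table(G)
best_pair = max(scores, key=lambda pair: scores[pair]["jaccard"])
print(sorted(best_pair), scores[best_pair])
```

Sorting the table by a different column shows how the indices can disagree, which is worth checking before settling on one of them.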
Contact us for solutions to this machine learning assignment. Codersarts specialists can mentor and guide you through such machine learning assignments.
If you have project or assignment files, you can send them directly to contact@codersarts.com