CSE-6242 - Data & Visual Analytics
Reviews
The course videos are useless. There is no video office hour, all office hours questions are handled in a Slack group. The homeworks are pathetic and you will learn things that are useless in the industry, for example D3. The only thing that counts (if you’re lucky) is being in a good team for the course project, but other than that this course is a huge waste of money.
This class ended up being significantly less work than I anticipated. For reference, it was my 7th class in OMSA and I had no programming experience before starting this program. During the semester, I felt that I was only actively in class for maybe half of the time. If you have moderate to advanced Python programming skills (which you likely do if you take this later in the OMSA degree), then it is not particularly challenging. Assignments are primarily busy work and most can be completed in a single week/weekend. Since assignments are on three-week timelines, you will then have nothing to do for two weeks. The project is also due one or two weeks before the end of the semester, so there is generally a lot of time during the semester where you won’t actively be doing anything.
A couple notes on the project - Get a good team for the project and understand that the project grading is quite lenient. Pick an achievable topic and limit your scope to that specified in the project description. Also, the midterm deliverable for the project is ~60% of the final report, so if you just follow the schedule for the project and reporting, it’s a very reasonable workload.
Overall, my opinion of the class is neutral. I don’t feel that the coursework does a good job teaching you much that you shouldn’t already know by this point in the program. However, really anything to keep my coding sharp is helpful. If you are a strong programmer, then I think it is safe to pair this with another course.
As someone who has elementary Python skills, this is a hard class. There are 4 homework assignments with 2-3 weeks given to complete each, so for people like me, you have time to self-study to figure out how to do the assignment and ensure that the answers have been correctly interpreted by the auto-grader. You could also compare the intensity of the homework assignments to “real world work situations”, which is not fun when you’re also working full time or already know this experience.
However, the balance of the homework assignments to the project is to your benefit at 50/50 for your grade. Choose a good team (ideally 4-5 people) and meet regularly. It helps to select a topic that is of interest and have someone serve the role of “PM” to keep everyone organized.
The class exposed me to new technologies, research, and articles. I took the “L” by failing the D3 homework assignment, but it didn’t break my grade, so try to not stress if you’re having a bad/busy couple of weeks and an assignment or two has to fall through.
It’s four assignments and one group project. The assignments are time consuming but not difficult. The group project requires staying on top of all the deliverables. There is a lot of work but the work is not hard. The recorded lectures were largely irrelevant but there are bonus quizzes that cover the material on the lectures. The main benefit of this course is that it is very broad and not deep and will pad your resume with skills like D3, Tableau, EMR/AWS, Pandas, and PySpark.
I have a relatively strong coding background so I didn’t find this course especially challenging (but see why others with less experience may have). The homework assignments still took a pretty significant amount of time, but it was helpful that there were only 4 of them. While it was intimidating to be introduced to so much new software in such a short amount of time, in the end it was extremely valuable and reflective of a fast-paced work environment that I find myself in currently. I learned so much and actually enjoyed the group project in the end. If you’re willing to put in the work and learn on your feet, this class is totally worth it.
Also, I never once had an issue with the auto-grader. It was actually one of my favorite parts of the class; submit as many times as you need to until you know you have 100%.
BLUF: A wide breadth of learning on data visualization apps (JS, Tableau, OpenRefine) and machine learning software (Python, Docker, Databricks, AWS, Google Cloud).
Background: 1st year OMSA student with knowledge of machine learning topics, some Python, and some SQL. No JS or D3 experience. I will be receiving an A (94%), but only because I left some parts of each homework blank - the juice wasn’t worth the squeeze. Working full-time and did just this one class this semester.
The Bad: Homeworks were time-consuming, lectures were not useful to course completion, and the Piazza was a mess (although the TAs really tried). Expect to google and ask peers for help a lot.
The Good: Exposure (and learning) of A TON of industry standard software, interesting and varied homework problems. All parts of the project are very leniently graded, as long as your group can write decently well.
Project: I lucked out and had a pretty communicative and knowledgeable group. If you’re an early OMSA student, I’d recommend finding one or two folks with machine learning experience to help build your algorithm. The data viz portion we did in Tableau and that was easy.
Overall this course did consume a lot of my time via the Homeworks, but that’s because I’m newer to Python and JavaScript. One groupmate (OMSCS) finished each homework over one weekend, but it took me (OMSA) the full 3 weeks. I enjoyed tinkering with the many different tools that were provided by this course, and believe that putting them on my resume helped me land my first data scientist job. It’s a wide breadth but no depth sort of class.
I’m an OMSA student with a strong Python background. Due to my previous programming experience this course was a breeze. I can certainly see why some students have a rough time with this course: if you aren’t a strong developer some of the homeworks would require 20+ hour weeks just debugging. There are many complaints about gradescope and homework instructions being a bit of a mess: I actually think it’s pretty impressive that the TAs and prof can put together these assignments the way they do and evaluate them at scale. I’m honestly surprised that issues don’t come up more frequently.
Take a D3 tutorial before HW2, don’t watch the async videos (they’re useless), and make sure you have a good group for the group project.
Grade: A (I got 100+)
Lectures: They are useless. You are better off googling the topics and reading about them yourself.
Background: Rudimentary Python and SQL
What I liked about this course:
- You learn a lot of new skills
What I did not like about this course:
- The lectures were not useful at all.
- Group Project: I am an OMSCS student and this was my 4th course. I formed a group with 3 other OMSA students who have finished at least 75% of their program. During the course of the group project, I collaborated with them. I listened to their ideas and I helped create a design plan even though they ultimately went with a project idea that I didn’t really want to do. I wrote almost all of the proposal and progress report. I coded over 90% of the project. I understood my teammates may be a bit new to Python. But the excuse was a little lame… they said “You’re from OMSCS so you’re better. That’s why you’re doing most of the work”. The code my teammates produced was riddled with bugs and needed to be optimized in terms of speed and memory. I helped them do that. Anyways, towards the end, my teammates became increasingly disrespectful. From talking to me in a condescending tone to pressuring me to help them even though I told them I had a long day at work. At this point, it looked like I would be creating the poster and writing the final report on my own. I was fed up with how they were treating me and having to do all of the work, so I reached out to the professor and TA. The TA shared my email with the other teammates and asked for their feedback. He did not forward their feedback to me. Anyways, these teammates then told the professor a bunch of lies about how they worked on the project and that I wouldn’t take their ideas. The professor sent out an email and basically said he thinks everyone did their part so he gave them credit in the end. I never got an apology or a proper thank you from my teammates.
I would not recommend this course to anyone. The course is poorly planned out, there is no guidance at all. The professor does not care if all teammates contributed equally. If you want to learn DVA, just download the syllabus and research the topics by yourself. Create a project and post it on Medium or Github. If you do take this course, I wish you the best of luck with your teammates. If they are freeloaders, let the instructors know early.
Prep programming (D3/Javascript) in advance. This is a self-taught class, and lectures have almost no weight on your grade.
I like the idea of getting a lot of hands-on programming experience, and using several different tools to do it, but the execution of homework is not great for a couple reasons.
- The instructions are all over the place. Some are in the 20+ page HW PDF while others are in the code templates.
- The autograder is awful to work with. There have been several questions where I have a working solution that matches the DOM template, but it is not built the way the autograder is anticipating so I get 0 credit until I completely redo it.
I wish it was broken up into smaller chunks like CSE 6040 HW.
I’m OMSA and have limited programming experience beyond python and R, mostly just what I’ve gained through other classes in this program. If you have a ton of experience and know what you’re doing, great, this class probably won’t be that hard. If you’re like me, this is the worst class I’ve taken in the program (it’s my 10th).
There is essentially no attempt made to teach anything. They give you assignments, maybe they gesture at some tutorials, and expect you to figure it out yourself. You will spend hours trying to do the most basic things and for the most part it’ll feel like a giant waste of time. TAs will help with some debugging, but if you can’t figure some things out, you’re most likely just out of luck.
The only “teaching” function they take on is providing and grading assignments. The lectures are a complete waste of time. I have no idea why I’m paying tuition to teach myself using resources completely independent of this course and program. What really drives home that this class isn’t about learning but instead just processing students through, is that they don’t provide solutions. So, if you can’t figure something out on your own, you’ll just never know and never learn. I assume it’s so that they can change assignments only minimally semester to semester without worrying about solutions circulating out there. To me, that’s prioritizing the course being lazy over providing a quality learning experience.
Fortunately, the third and fourth assignments aren’t anywhere near as painful as the first two - in particular the second assignment. As suggested in other reviews, read the D3 book ahead of time if you can as it’ll make your life easier once the course starts. Don’t be afraid to try and game the grader - in some cases if you can’t get things to work properly dynamically, you can figure out the specific case(s) the grader is checking and hard code the solution. I think I spent 30-40 hours each on the first two assignments (if you include time spent reading the D3 book). Then probably <20 hours each on the subsequent assignments.
The project can be a bit of a cluster. I had multiple team members withdraw from the class and consequently drop from our team the week leading up to the first project deliverable. We’ve cobbled something together for the final deliverable, but have no idea how it’ll be graded.
The class covers A LOT of material but never goes too deep into anything. The HWs are barely related to the lecture material.
There are 4 HW assignments of about 5 questions each. HW2 is the hardest. There are no exams. There is a big project throughout the entire class with 3 deliverables.
To see my full review, check out my youtube channel: https://www.youtube.com/watch?v=XuXsg8H-jy8&t=10s
This is a review from an omscs student. Most other reviews you read will be from omsa students.
First I wanted to say, this class has to be great for interviews. Immense breadth, not a lot of depth. Interviewer asks “do you know ___” or “have you ever worked with __”. This class will add 15 technologies to your repertoire.
Difficulty: This class is one of the easiest ones I’ve taken, not to mention among ones that are required for the ML track. Homeworks are not challenging if you’re good at programming. Some JavaScript, some SQL, and a good amount of python. Easy. If you know how to code and have any experience with SQL, or better yet, have taken 6400, you’re all set to ride along.
Group project involves talking and collaborating with people tho, which we all know is draining by itself (I’m looking at you, nerd). But there are no tests whatsoever, and lectures are so short, so that makes up for it. Group project requirements can be satisfied with some trivial sklearn prediction, even if it’s not that accurate.
I finished the class with a +100, something I cannot say about other classes (cough cough ML and RL). After all, there are no surprises, you submit the hw as many times as you can until you get 100.
I listened to the advice of the administrators in omsa.ga.
During your holidays, dedicate 3 weeks to learn D3.js at this website.
Yes, the entire 35 episodes for it. Write a note to thank me later.
That is literally the life-saver for the entire class because 20% of our grade is dependent on D3 (part of HW1 and whole of HW2). You will free up more time to focus on your group project.
Yes, some group project prospectors would require you to have already done so before joining their group, because they want you to share their workload in their project later on.
A very well planned course that makes sure you have a basic knowledge about data and visualization. The visualization part covers D3 and Tableau while the data part covers Spark. Knowledge of JS/Python helps but if you already know D3 and Spark, HWs should be very easy.
Grading distribution is 50% on individual homeworks and 50% on group project. Good thing about it is there are no exams; and quizzes are for bonus points only. Bad thing about it is that the grading highly depends upon the group you work with. (Luckily, I had a decent group to work with, resulting in an A.) You along with your group choose what to work on in the group project from scratch. They have taken extra steps to autograde all HWs so that the scores that you see in Gradescope are most likely final, which is great! (Their claim is that they are unaware of any other attempt to autograde D3 assignments.)
TAs are very reasonable both while grading HWs as well as group project. TAs are assigned to group project based on the domain you choose; e.g., we chose to work in the pharma domain and our TA had an MD and was a data scientist in the healthcare domain. When students found some issues with the autograder for one of the HWs, TAs announced that 100% of points would be awarded for an 80% autograded score (and for a 100% autograded score, they awarded bonus points!)
Overall, I highly recommend this as one of the electives for anyone interested in ML specialization.
4 Homework assignments, and 1 group project.
HW1 APIs: Medium difficulty, but I can see myself applying the concepts learned.
HW2 D3: Very difficult and tedious, it took me 2 weeks to finish it (with full time work), I can see myself applying the concepts every now and then. I actually applied those concepts in the project.
HW3 PySpark/Scala: Medium difficulty, this was one of the more fun assignments we had. You actually get to use real big datasets on AWS, Google Cloud, and Azure.
HW4 Random Forests: I found this to be more of a filler assignment because I have used and applied the technique multiple times in the past in other courses and my profession.
One thing about the assignments though, they are VERY verbose. Each assignment had 3-5 questions, but the pdf for them was 15-30 pages. The wordy nature did provide some guidance, but I think when you start the assignment, it seems daunting.
Group Project: I liked that the project is worth half your grade, and throughout the term you have to send proposals, and project status reports (also graded) which force you to stay on top of it. As far as the grading is concerned, if you submit a report that shows some innovation from an analysis perspective, and application of visualization concepts, you will be fine.
The lectures provide a good overview of the concepts and they really help with the bonus quizzes, but you need to go way beyond the videos to complete the assignments. Overall, I liked this class. Being picky about axis ticks and titles did get a little tedious, especially since Gradescope is unforgiving about those details, but it showed the importance of visualizations when trying to tell a data heavy story.
This class was very hard for me and I’m glad it’s over. That being said, I believe this class really challenged me and I learned a lot.
The project is good if you can get a good group. Our group was lucky in that we all lived close together and were able to meet in person once to get to know each other a bit.
A lot of the homework feels impossible but finishing it is so satisfying. It just takes a lot of time especially if you have not previously been exposed to this stuff.
The videos seemed worthless and I hardly watched any. Other classmates seemed to share this experience.
Pros
- A great example of an “applied” course. No exams or silly memorization, which is fitting since analytics and data visualization are “hands-on” topics.
- Assignments do have their problems, read my course Cons below, but they do provide good hands-on experience with various visualization tools (D3.js, Tableau, etc.) and computation platforms. If you want practical experience with visualization and computation, this is a great course.
- A subset of the TAs are courteous, patient, and helpful. They make the course more pleasant and provide positive examples of what TAs should be.
- As long as you get into a great team early in the semester, don’t sweat the group project. Rather, keep an open mind about how working with a smart, diverse, team throughout the semester can enhance your course experience and take-aways. I typically despise group projects, but was fortunate to have made it into a great group. I remain thankful for my group’s hard work and different viewpoints provided throughout the semester.
Cons
- The group project can turn into a problem if teammates don’t pull their weight, or make commitments they don’t deliver on. Be honest and open about tracking each member’s contributions, and don’t hesitate to speak up within the team, and eventually to instructors, if some team members are just along for the ride.
- The assignments need improvement. Some of the requirement documents are indeed cumbersome to read through or don’t do such a good job steering you away from needless time sinks. On one assignment I found I was losing points due to what otherwise appeared to be a benign data type conversion. Piazza had similar posts from other students on the same assignment. Other assignments had considerable issues wrt how the auto-grader was implemented. And it’s not that the auto grader itself is bad, it’s how the automated tests and condition checking were set up; this is something TAs state they’re still improving for future semesters.
- Some of the TAs are noticeably condescending, or don’t hesitate to suggest or imply how silly they think some questions are.
After almost completing this course, I feel like it took me a lot of effort and I didn’t actually learn much besides some visualization techniques. Homework was not very hard and even that famous HW2 was manageable. I did do JavaScript and D3 introductory courses before I started DVA though. On the other hand, I didn’t like the group project. I did A LOT for it, probably 70% of all the work and it was exhausting without getting much learning back, for example, I knew already how to write a report or prepare slides, but this work is a bit boring and doesn’t add much value.
Background
I work as a data engineer and this was my 8th OMSA class. My project group and I are expected to get high A grades.
Review
Incredibly poor class structure. The class seemed to be an exercise in assigning rote, administrative tasks rather than an educational enterprise.
The 4 homework assignments consisted of 10-50(!) pages of instructions that somehow introduced more ambiguity into the process, leading to multiple Piazza clarifications. These were necessary because points could be (and were) deducted over tiny aesthetic infractions.
Lecture videos bore no resemblance to the assignments, and so are not worth watching, nor is there much inclination to do so after following all the details of each homework. When I would tune in, it was like an entirely separate class.
The actual work itself is about where you would want it to be for an upper-level graduate programming class. I have strong Python experience so did not have tremendous trouble with HWs 1, 3 and 4. HW2, the D3.js assignment, was a struggle for me as I had severely limited front-end experience, I got through it with enough time, despite autograder issues.
Autograder issues persisted throughout the entire class, all 4 homeworks had at least one major issue with the autograder, to the point that HW2 was curved upwards to compensate (submit 4/5 questions for 100% or 5/5 for extra credit). Combined with the seemingly never ending instructions this made for a frustrating experience.
TAs were sometimes helpful, mostly not, clearly sometimes without answers but not wanting to admit it. This contributed to a lack of trust between students and TAs, it made it difficult to tell whether an error you saw was due to your own misunderstanding or autograder issues, with TAs obviously trying to answer as ambiguously as possible.
The group project guide is over 6,200 words long, full of asides and specific file formatting instructions to follow. The project itself is as hard as you make it; our group built a Flask web app and were successful in delivery and execution, but a much simpler project would also have met all the marks just fine. Again most of the challenge was following detailed, but largely unnecessary, instructions to the letter.
If this class were rebuilt from scratch and trimmed down severely, it might even be a good class. As of now, that is not the case.
If you’re in OMSA, you have to take this class so the best you can do is form a solid project group as early as possible, brush up on D3 and grit your teeth and get through it. You won’t learn a thing but that doesn’t appear to be the goal of this class anyhow.
I took this as my third course and it was… meh? I’m a software engineer by day, so I’m comfortable with coding and it wasn’t too difficult. The most difficult parts of the course were D3, as I have little front end experience and the HW felt like being thrown into the deep end (although the recommended textbook was an excellent resource and really made the HW manageable), and the tedium of the group project.
Digging into the group project, it was my least favorite part of the course. I’d probably recommend taking this course later on in your course load since the project is extremely open ended and 95% of what I applied to the project was from material outside of this class. My team was nice and our project was decent, but the project deliverables were so arbitrarily complex (e.g. answer these 11 questions + cover a 12 citation literature survey in a 2 minute presentation) and stressful that it really ruined the project for me. Towards the end I checked out, since the grading was very lenient, I was doing well in the course, and I didn’t have much patience left.
In terms of the content covered in the course, I thought it was all over the place? I didn’t really see a theme between everything covered- I think focusing on fewer topics and going into more depth would be beneficial. Also, there isn’t that much of an emphasis on visualizations in this course; we covered a little bit of tableau and D3 in HW2, but that’s it, really? Everything else felt very surface level and I don’t know how much I’ll actually remember. Overall, I did enjoy the homeworks, although HW3 felt very repetitive.
I think it would be beneficial to focus more on visualizations (more time spent on tableau, seaborn, matplotlib, etc) and refine the scope of the course; I left it feeling like I know a little bit of a lot of different, somewhat related things, but nothing in much depth.
The class is difficult, no doubt. The challenges are that, first, homeworks take time. Writing the code is one aspect, debugging is another. However, even with limited coding experience, I found the coding part similar to how I did things in real world - lots of frustration only to find out I forgot a comma somewhere. I reached out to the TAs with a question on my code a couple of times and they were helpful. The only thing I found poorly executed was HW 4 - Q2, not enough instructions. I ended up not being able to do some other questions as well but not because of a lack of instructions but because of my skills. Second, the group project is hard. The usual group work dynamics apply - you have to be patient with others, you have to depend on work of others (which you may not always agree with), things usually get done the night before or of when something is due, people don’t always put adequate amount of effort…come prepared that this is a possibility in this course, frustrating - yes, manageable - absolutely. There are a lot of small details on the group project that at some point become frustrating to students, especially at the end when you just want the semester to be over - naming files in a certain way, writing abstracts in an academic style, breaking down sections in a very publication style way. This is frustrating, I get it. However, if you come out of this class with at least some semblance of a deliverable, and you are someone who has not been an analyst/data scientist/developer before, you have a portfolio that you can talk about in front of employers. Coupled with your practicum - that’s 2 deliverables you can talk about instead of just textbook formulas. That’s the point of this class as frustrating as it is. To the people who are venting below - relax, may not be the best class but it serves a purpose and not everyone is going to like it. If you get below A - who cares, a few years from now, you’re not even going to remember it.
The fact the professor is not engaged in this class is true, though unfortunately it is not a unique problem to this class. Very few of my other 9 classes in this program had a fully engaged professor.
Overall, course absolutely sucked. No instruction, horrible instructions, demands expert-level knowledge in all technologies surveyed. Why did I just pay $1k for this crap?
Comments about your effort (eg., - was the expected and expended effort appropriate for this course?).
Absolutely not appropriate. The homeworks assess if you are an expert-level user of each of the technologies surveyed. If you are teaching a survey course, you should not expect students to acquire expert level knowledge. In no business in the world are you expected to learn a brand new technology and deliver an extremely high-quality product in only 3 weeks.
In addition, it is ridiculous that a product can be absolutely perfect and fail to pass an autograder. I’m not just talking about D3 - My python workbooks, in environments set up specifically for HW3 (the environments you and your staff set up), were working perfectly and outputting exactly what was necessary, and the autograder failed to run my code. Unbelievably disorganized. I spent more time on HW3 getting my correct results to pass the auto grader than actually doing the work.
What were the best features of the course, such as lectures, activities, assignments, and projects?
Working with a group cross functionally was great. We assembled a team that had a variety of skillsets.
How could this course be improved?
I swear to god the directions for the project are longer than the maximum length for any deliverable. We were docked points for not having enough detail when we literally could not fit any more detail in because of the extremely low maximum page requirements. I understand the need to be brief when presenting to executives, but executives will never ask anywhere near the amount of detail required by the directions.
Directions for project and homework often contradicted themselves. TA posts on Piazza contradicted the project instructions and posts from other TAs. Polo rarely if ever participated.
What was the greatest strength of this instructor?
Being smart
How can instruction be improved?
Participate in discussions
Actually teach the material that is being asked in homeworks
Clarity and brevity of instructions
Clarifying things when TAs post contradictory answers
Staying engaged at all
All of the lectures were just Polo talking about how smart he is and how great his accomplishments have been. Instead of me paying $1k to watch you flex on everyone, maybe think about actually instructing your course.
A course in which a handful of students (mainly from a certain track) were basically found out:
- their coding standards are basically not good enough (did not do enough CS/CSE “C-track” classes); and
- their data analysis and machine learning theories aren’t up to scratch (did not do enough ISYE “A-track” classes); and
- they are over reliant on sub-5 hour workloads in their electives (check out those MGT “B-track” classes); and
- they could not get good project buddies either because (1) people find it hard to work with them due to the lack of technical skills; or (2) they were too lazy to find a group and did it last-minute.
So they attempt to moan as much as they could.
Yes the course design could be improved - Gradescope on something that’s supposedly visual ain’t a good idea to be honest, this prevents me from giving a “Strongly Liked”.
However, the continual bashing of the difficulty of the class? Give me a break.
I am all for this class being a gatekeeper of sorts for people to graduate with the OMSA degree.
Cleared due to OMSCentral Owner being greedy.
First, I got an A in the course. I took this my last semester and I have great grades.
This was the worst class of the OMSA program. I could go on and on and on about why this class sucks. The course is NOT hard, it is tedious and poorly taught (or not taught), TA’d, constructed, and graded. I spent most of my time setting things up, working through system issues, and TONS of waiting to access sites and grading system. Hot waste of my time.
Be prepared to pay $1000 and teach yourself everything. Waste of my time and money. I could have paid ½ the price and taken a better course elsewhere on data visualization. What a joke.
(And for those reviewers who state the people who struggle are OMSA students (vs CS) or OMSA students from ‘other tracks’–get over yourself! There are plenty of other courses which are far more rigorous with great reviews. This class has sh%# reviews because it is.)
I was looking forward to a data visualization course but feel disappointed here. The teaching felt minimal and TAs seemed overworked. Information is scattered across multiple platforms (Piazza, syllabus, project website, homework instructions, homework skeleton code, etc.) where parts of instructions can be found in multiple places making it difficult to piece together at times. Overall the course felt like it was trying to touch on too many various “data” topics, rather than focusing on data with “visualization”.
The course was structured as 50% homework and 50% group project. The homework was divided up across 4 assignments, all mostly auto-graded. I expected most of the homework to be devoted to visualization techniques but I’d say only about a third of the homework actually was (if that). Some of it felt like CSE6040-part 2. One homework assignment was on how to handle big data - which seemed like the actual topic of interest for the professor rather than visualizations. The non-visualization assignments were hard for me to get motivated for. The coding was frequently more annoying than “hard” as you have to try to guess what the autograder is looking for (your answer could look right but it’ll fail for a DOM structured differently or sometimes for silly reasons like the size of your dot was supposed to start at 3mm, not 5mm, before enlarging to 9mm, not 7.5mm, with mouseover…).
Overall, I really enjoyed the visualization portions and wish there could have been more of that. Would be even better if they update the lectures to teach the techniques on the homework rather than having students find lessons elsewhere. I do not feel like my coding in Python is any better and I’m frustrated it was so much of this class when it seems unrelated to what I expected. I do feel appreciative of the exposure to Azure, Tableau, Google Cloud, AWS, and d3.js. Wish the class would have covered Shiny as well and other options in data visualization.
Just too much work. I don’t really see a point in this rat race. Unfortunately the class was mandatory for me (OMSA).
There’s not much teaching in this class, so you’re basically expected to learn on your own to complete the HWs. Not ideal if you’re not coming from a CS background, but I have definitely learned a lot. You’ll have a somewhat easier time if you come from a CS background. In some cases I’m certain my code was ugly and/or inefficient, but no HW solutions are released so you can’t really learn best practices. TAs don’t give much guidance and I feel like a lot of Piazza responses from TAs are after someone has requested input and then later the TA responds and says, “Well, it looks like you got it figured out,” or “It works for me, so keep trying.” Helpful. If you miss something in the syllabus or on the website and ask about it, responses can be unprofessional and rude.
One good or bad thing depending on how you look at it is that things are auto-graded via Gradescope, so you can have your visualization working and still lose points if it’s not showing up exactly how the autograder wants to see it.
The project requires academic literature reviews, posters, and portions of it aren’t worthwhile. Doing a visualization project and a group project is fine, but all the other mess seems extremely out of touch with why people are probably pursuing a degree. A large group project can also be challenging as that portion of the class is no longer asynchronous when you meet, and everyone is spending a lot of time on the HW so it’s hard to focus on the project as well. Especially when parts of it don’t seem valuable.
Videos aren’t very helpful and mostly discuss concepts. If not for the few small bonus quizzes (the structure of which is silly), there wouldn’t be much reason to watch the videos.
Overall, if you work at it you will learn a lot in this class. You’ll learn it on your own and it will be a bad experience and you won’t learn best practices, but you will learn a lot… if you’re able to retain it all. Too much is covered in a single class, and the class should be broken up and redone so some teaching can actually occur and so topics can be covered with the depth they deserve. If you don’t have to take this class, I would recommend against it.
I’m coming from a CS background. A lot of breadth, not a lot of depth, as I was expecting though. Lectures are pretty bad. Good JS practice though, but D3 isn’t really used much in industry anymore I feel. Good survey class for data engineering.
As an OMSA student I heard lots of horror stories about this course; some are true, some are exaggerated
I was already strong in Python and PySpark which helped a lot with HW1, 3 and 4, so they took up way less time than I was expecting. If you aren’t confident in Python then HW1 and 4 will definitely take up a lot of time. If you scored highly in the CSE6040 exams you will likely be ok
I didn’t watch any of the lecture videos and from what I gather from others there’s no need. Does make you think what we’re paying so much money for though?
There’s no getting around that HW2 is a beast and easily took me more time than the other 3 HWs combined, I had never coded in D3 before and would recommend watching some tutorials to soften the blow beforehand
HW3 was the best HW, getting a whistle stop tour of GCP, Databricks and AWS while getting to use (Py)Spark was great because these are super relevant in industry
If you’d like to practice Spark before the class you can set up a free Databricks Community account where you get assigned a small free cluster with Python and Spark preinstalled which you can play with, very useful.
HW4 involved coding up a Random Forest which was tricky, but there are a million blog posts out there that have done it before which can help
The project is as difficult as you make it. DO NOT overcomplicate things or you will end up not submitting for some parts. Every part of the project averaged 90%+ across the class, so just having a reasonable submission will likely get you most of the marks.
Our project consisted of an Excel spreadsheet, a small notebook and a Tableau dashboard, and we scored 100% with a small amount of effort compared to the projects I peer-reviewed. Having a simple project idea which solves a problem and which you can talk about (there is A LOT of writing) is more important than having some super advanced ML model. The extra time saved on the project complexity also gives you more time to finish the HWs!
However the most important part is having teammates that pull their weight irrespective of their technical skills. I see a lot of recommendations to form a team early, which I agree with, not because you need a diverse set of skills but forming a team early means you are more likely to have an organised, competent team. My team (all OMSA) were great even though half were in Europe and half in the States, we assigned a project manager early which helped with the flow of team meetings and everybody did their tasks on time
Overall I wish I’d paired this with a lighter course and if you are confident in your Python skills I would recommend you do this too. I scored 100% overall and (aside from HW2) didn’t really feel stressed
This is a heavy coding course, but it is not hard to get a good grade. It’s good to be exposed to many tools and platforms, but the limitation on which packages you can use for homework is killing me. I have so many ways to approach the goal but I can’t use them because they are not in the package/version this course allows.
While many of the things will be useful in the future, there is also something that took most of the time and effort but is not at all useful for many of us in our careers: D3.
Besides the coding, nothing is hard. The logic and stats behind each HW and the project are fairly easy. (I have a math/stat background and little programming experience, so coding does take time for me. I don’t recommend this course to coding beginners, as debugging will take forever and there are 4 HWs and 1 project, so lots of coding.)
As for the project, be prepared for anything; recall the pleasant and not so pleasant teamwork experiences in college.
This was a tough course but manageable if you pick a good team with members that contribute fairly to the project (which, given how you’re supposed to find a team, is based purely on luck). The nature of the course is quite brutal given the difficult homework assignments, high-level lectures that neither complement the assignments nor provide coding examples, and TAs’ helpfulness only extending as far as what you can search online. You are for the most part on your own to figure it out, which I don’t mind, but the added pressure of the project that’s worth 50% of the grade, not to mention also having to work around other people’s schedules, skills, and time commitments, makes this course incredibly hard to enjoy.
After reading the reviews, I thought this class was going to be a nightmare. I took Simulation at the same time and my DVA project group thought I was crazy.
Final take: not as bad as expected.
The workload was on the higher end especially for HW2, and it did take some juggling priorities to manage in combination with Sim. However, I’ve definitely taken harder classes in the program like CDA (ML) and TSA. I did take a JS course and a D3 course over the break and that helped. I also got introduced to my project team through a classmate from a previous class and the team was great. Having a good project team definitely eliminated a lot of stress.
DVA is really a survey course that exposes you to a lot of different topics but with little depth. D3 is kind of cool, but I’ll probably never use that. Using GCP, AWS, and Databricks in the same assignment was interesting because you quickly saw the similarities and differences. Much of the other information like clustering, classification, graph analytics, ensemble methods, text analytics, etc. I’d done in other courses, so that was repetitive. I can imagine that if coding is not your strong suit and you’re not used to bouncing around between languages, DVA would be difficult. Taking the class toward the end of my coursework probably helped, since I was already familiar with many of the concepts.
I thought the grading was fairly generous despite the pedantic reputation of the class. With bonus points, I made 100+ in this class, despite making a 70-something on one portion of the project. Also, there are no tests other than bonus quizzes which is nice.
Are there better classes? Yes. Are there harder classes? Yes. This will not be your favorite class, but mentally prepare for the workload and you’ll be fine.
I came from a non-CS background with little programming experience, but was able to get 100% in this course. It is not hard, it is tedious. I spent a lot of time on this course but it was worth it. However, there are too many topics taught in one semester. It would be better if it was purely a data analytics course or purely a visual analytics course or an ML course. I am not sure if I will be able to retain everything, as too many things were crammed in together.
Seems that others said it all. While I got slightly less than 100%, I still think this is a very bad course.
A class with a lot of potential, but misguided in execution. In the spirit of not repeating everything that’s been said before, I’ll try to summarize in reference to prior reviews.
The Good
- You will learn JavaScript and D3 very quickly. I didn’t know any HTML, CSS or JS and this was a fantastic way for me to get some understanding very quickly.
- You will get the gist of Spark, so you can be conversational in it
- Homeworks overall do make you think and work for the points
- TAs worked hard to clear up confusion on the homeworks and address pain points
The Bad
- Project deliverables are too academic for this class & the people who are taking it. Literature review? Testbed? Poster presentation? Class would be better served by having a project that more directly focuses on ML and visualization than hitting academic check-boxes.
- Covers too many technologies. Homework 3 covers PySpark, Scala Spark, Databricks, AWS, and GCP. While all are interesting, this leaves students ill-equipped to speak with confidence about any of them. In my CIOS response, I made a lengthy case that students would be better served from getting deeper knowledge in one technology and one platform (e.g. PySpark on AWS is probably best).
- Be prepared to do the entire group project with 3 members rather than 6. RE-READ THIS PLEASE. I naively tried to get uncooperative group members to contribute to our project, and it made the whole experience way worse. There will probably be people in your group who give 0 shirts about contributing, and want to ride off the others. It’s just a fact of the OMSCS/OMSA group project dynamic; be prepared to deal with it.
- Lectures are so superficial as to not be useful
I wish they had stayed focused on Data Visualization. The course really just touched the surface of so many things, spreading endlessly wide like a drop of oil on the ocean surface. Even within one homework, as soon as I finished one question, it was totally forgotten; too much info to digest and remember. Difficulty is low, it is just the amount of work, and its spread, that kills you. It is like running a long, never-ending, boring marathon. If you truly do this course right, it will take more than 30 hours per week. I see some putting in 15-20 hours, but having 20 years of coding experience in all languages, I keep SMH. I had to drop a course; I couldn’t keep up with both and barely made it within ~30 hours per week. So I keep SMH when I see 15 hours per week. (I had no problem at all finishing CSE 6040 & ISYE 6501 both at once)
D3.js is covered nicely. I enjoyed D3.js as well as implementing Random Forest, Microsoft ML and trying out some other vendors. However, implementing Random Forest, or trying all the large database implementations, was off topic, I think. Mixing so many things in such little time doesn’t help; it becomes counterproductive. I already forgot most of them. If this class is meant to be “make it or break it”, then yes, it achieved the goal.
After 2 weeks, all I remember is ~ 30% of it:
- Principles of Data Visualizations (Tufte)
- D3.js
- Microsoft ML tools
- The long and constantly demanding course project.
Prof. Polo is very friendly, supportive and active. I believe he is trying to reach for too much with this course, but overall I certainly believe his intention is to make this a great class.
There’s a lot of negativity in these reviews, so I wanted to offer a more positive viewpoint. I will caveat that my day job is a FE engineer, and I was already interested in Data Viz, which is why I signed up in the first place.
The assignments were a bit contrived. There are 4, and each one comes with a 10+ page instructions doc that goes step by step through everything you need to do. Sometimes on the order of “click this button, type in this thing, click that button”. It was very prescriptive. But I had no trouble completing the assignments. There were I think 2 actual assignment corrections that people got mad about, but given there were hundreds of instructions/things being graded, that seemed acceptable to me.
Each homework question has a Piazza megathread, so you can ctrl+f your way to the relevant post. For this reason, I recommend waiting a few days after the homework opens to get started - at that point many of the confusing bits will be clarified and other students will have worked out the kinks for you.
Yes one homework is all about D3. That can be super daunting for people with limited web dev experience. I have plenty of experience and I still didn’t like D3; it doesn’t follow any of the best practices I’ve come to love. But! I super appreciate the flexibility D3 offers and I’m thankful I know a bit about how to use it. That will definitely be useful later in my career (even if just to veto the use of D3!).
The group project is what you make it. Have fun with it! Find a group with a good mix of abilities and set your expectations with everyone early on. I had a great group and we executed quite well.
It was annoying how the last few homeworks overlapped with the group project. I’d rather have spent more time on the group project.
Another complaint I read about is how you survey a bunch of technologies but don’t go deep on any of them. Take it in stride! You’re gonna learn a bunch of new tech and what it’s good for, quickly! I’m not sure why people think that’s a bad thing. You can always go back and learn more about it later. I’m happy I got the chance to learn how to use AWS, GCP, Azure, etc etc. If I ever need to do it again, I won’t be a total n00b.
The lectures were enjoyable to me, but they don’t actually count (outside of some bonus points). If you don’t want to watch them, then just don’t.
Just my two cents. I really liked this class, but I took it knowing it was something I was already interested in and pretty good at. I did pair this class with ML4T and still felt the workload was manageable.
FWIW, I’m OMSCS and this was my 4th/5th class. I don’t recommend this as a first class.
This was my first course in OMSCS program. Not sure about others but I genuinely liked this course because I learnt a lot of new things. Here are my evaluating metrics
1) Lectures (3/5)
Pros: Lectures were very informative and gave a good intro to the week’s content. Other than the D3 and PySpark syntax parts, I did not need to do any additional study to do any of the assignments
Cons: They lack the depth required in the assignments (especially D3) and also need a revamp, as the versions used in the lectures are very old.
2) Homeworks (5/5)
They were challenging and interesting and I could feel the hard work done by instructors to create them.
HW1 : Involved 5 parts. Graded on Gradescope. I started early and took me 20-25 hours to finish it given 2 weeks of time.
i) creation of graphs, some graph related methods and display the nodes and edges and visualize it on a given tool.
ii) Write complex SQL queries for SQLite
iii) D3 warmup
iv) Open Refine
v) Python Flask
HW2 : Involved 5 parts. Manually graded by TAs. I started early and took me a week’s time to finish it given 2 weeks of time. It was the most difficult assignment for most of the class but I found it most interesting.
i) some barcharts on Tableau
ii) Force-Directed Graph Layout on D3
iii) Line charts on D3
iv) Line charts with interactive visualization on D3
v) Choropleth Map in D3
HW3 : Involved 5 parts. Manually graded by TAs. I started early and it took me 4 days to finish it given 2 weeks of time. Most of the time was spent setting up all the environments.
i) Complete methods and Run Pyspark notebook on Docker container
ii) Complete methods and Run Spark notebook in Scala on Databricks
iii) Complete methods and Run Pyspark notebook in Python on AWS
iv) Complete methods and Run Pyspark notebook in Python on GCP
v) Some easy experiments with Azure ML studio
HW4 : Involved 3 parts. Graded on Gradescope. I started early and it took me a week’s time to finish it given 2 weeks of time. It was the most difficult for me as I was new to ML models, decision trees and Page Rank.
i) Implement Page Rank and Personalized Page Rank Algorithms
ii) Implement Random Forest Classifier
iii) Create and train various ML models using Scikit-Learn
3) Group Project (3/5)
It had 50% weightage on course grades (don’t know why instructors decided so). Although there was learning during the group project and our outcome was nice, my experience was dreadful.
NOTE FOR INSTRUCTORS
In case you are reading this,
This time I worked really hard and my teammates got free marks for that. Please make it an individual project so that others can benefit from it.
Pros: i) Learnt a lot of new algorithms while reading lot of research papers
ii) Learnt more about Data collection and Cleaning
iii) Learnt more about ML models and D3
iv) Latex
Cons: i) Despite carefully choosing a team based on skills, got a bad team
ii) Had to do 90% of the work alone as other team members were only active two days before submission deadlines
iii) You can’t make your team members deliver the things they committed to do.
4) TAs/Instructors (5/5)
They (35+) were helpful; at least I found them useful. They replied to everyone’s Piazza posts within 8 hours and often reviewed our code in office hours. Once I was wrongly graded on a quiz and a HW; I reported it, and it was quickly resolved after I submitted regrade requests.
5) Bonus Quizzes/Questions (5/5)
They were entirely based on lectures, and someone who has watched all the lecture videos carefully can easily score 100%.
On a final note, it was an easy course for me (maybe because I am a software engineer) and I was idle half the time and wondered why I didn’t take one extra course with this. My advice would be to start early on every assignment and project deliverable so that you don’t have to rush towards the end. If you study regularly for 1-2 hours daily, you can get an easy A, I believe.
As someone who has experience in basic programming concepts (not a software engineer), this class was not as difficult as I thought. Yes the D3 was difficult, but it still applies the same basic debugging practices, and programming concepts I’ve encountered in previous courses. I also do recommend getting familiar with Javascript before the course starts and I agree, there’s no way someone could learn Javascript and complete the D3 assignment in the allotted time frame.
However, this is hands down the worst class I have ever taken (including my undergraduate experience). Besides learning about D3, no other content was related to “Data and Visual Analytics”. It was basically more Python practice with using a basic API and how to call methods in scikit-learn, applying SQL statements and SQL-like methods in technologies such as Spark, and using the same concept in some cloud computing platforms. Then out of nowhere, creating a decision tree from scratch. I’m sorry to say, but the lectures are also awful. It feels like a salesperson talking about a product instead of actually learning any actual concepts and how to apply them to any situation. Other people saying “the lectures are only worth 3 bonus points” doesn’t mean anything regarding the learning experience. I don’t watch lectures for a letter grade, but to actually learn, and the lectures are a waste of time. I could go on, but I haven’t even discussed the worst part of class: The group project.
This mandatory group project is an absolute nightmare. You are forced into this matchmaking arena before the course even starts to showcase your background like a dating profile, and hope you match up with compatible people. As a disclaimer, I actually had no problem with my team at all, found my group mates really quickly and enjoyed working with every single one of them. Once your legion is formed, you have to come up with a project topic that can literally be about anything regarding big data. However, you are forced to create the most ridiculous project deliverables I have ever encountered. Your initial project proposal requires a document that cites 18 different scholarly level literature references. All within 2 letter size pages. Not to mention, you also have to answer 9 different questions within the same document. Another example is creating a proposal video that talks about all your literature references and the 9 questions, and did I mention you only get a 2 min limit? The worst part is when you get feedback and points taken off saying “You didn’t go into enough detail about your xyz algorithm and how it works.” Well yea, WE ONLY HAD 2 MINUTES. Other examples include strict limits on a progress report, poster presentation, poster presentation video, final report, all while working on the actual project itself. These restrictions also make no sense, especially if the requirements are just plain tedious. Our group spent most of our time making sure our documents had “key” words and fulfilled a checkbox of items, and not focusing on writing up project documentation to the best of our abilities. Lastly, all the project grades are based on filling out these unnecessary documents and not the actual project itself! We don’t actually showcase the project, or get graded on any demo, but on how well we wrote about a project concept in some documents.
You can include optional demo videos with again a strict time limit, but the rubric states it won’t hurt or help you (which also makes no sense since this should be the most important part of the project). Never had I worked so hard for so little value. I don’t mind group projects, but this project was meaningless. All of us actually used prior knowledge and outside experience and nothing from the class itself in putting together this project. From a tuition standpoint, I am glad my company sponsors this program, or else I would be furious for paying for a course that tells me to work on a meaningless group project.
I will start by saying that I didn’t think this class was awful - for all of its problems, I actually did find much of the content fascinating and felt like I learned a fair amount. That said, the amount of time that needs to be invested to get a good grade in CSE 6242 versus the amount that you actually learn is way out-of-whack.
My Background
I'm an OMSA student who was a business major undergrad and has worked in business analyst finance/consulting jobs since graduating. I'm a professional-grade SQL scripter and was good enough at Python to ace CSE 6040, but I had limited or no experience with other programming languages, cloud computing platforms, or machine learning beyond a theoretical conception of it. I mention this because I think the people who disliked this class the most were software engineers who have familiarity with these technologies, whereas I actually appreciated the class's introduction to them.
The Pros
- Good content for the unfamiliar - like I said, I had almost no background with D3, cloud computing platforms, or machine learning, and I actually appreciated the introduction! Now, I know just a little bit more about web development, I'm a little more familiar with systems that run on the cloud, and I feel like I can do some basic data science in Python. This is why I signed up for the OMSA degree!
- Informative lectures - the lectures were also quality content, if you actually watched them. More on that below.
The Cons
- Overall tediousness of homework - while I felt like I learned from the homework, it was WAY more tedious than it needed to be. So much time on the homework was spent figuring out why one little thing isn't quite working right or scouring Piazza (a platform badly in need of upgrading) for the bit of information you need. I would have much preferred if the homeworks were more bite-sized like in CSE 6040 - it'd be much easier to feel like progress was being made and would have made the class less stressful.
- Throwing you in the deep end with D3 - Homework 2 deserves its own call-out. The Suggested Background Knowledge of the class says you should be, "proficient in at least one high-level programming language (e.g., Python, C++, Java) and are efficient with debugging principles and practices; if you are not, you should instead first take CSE 6040 (for OMS Analytics students) and if needed, CS 1301 and CS 1371." This is untrue. It is impossible to succeed on Homework 2 without at least introductory-level knowledge of JavaScript; it is not good enough just to have CSE 6040-level knowledge of Python. I recommend anyone taking this course take the Introduction to JavaScript course on Codecademy beforehand. All that said, I actually enjoyed learning about D3 and felt like I could make some really cool things with it! But the path to get there required the use of wayyy more outside resources than a class like this should.
- No grading incentive to watch the (informative) lectures - the lectures aren't really linked to the Homework, they are only linked to quizzes that are worth a total of 3 bonus points that can be earned on your final grade. I think this is a shame - I would actually have preferred this class to have tests based on the lectures instead of the group project, because there would have been a much stronger link between learning the actual material and one's final grade in the course. Which leads me to my final con…
- The Group Project. All of it. - just do away with the whole thing, and I say this as someone who was part of a great group that has done well on all components of the project. The idea that we need to be scheduling to meet with others to work on the project just fundamentally runs against one of the major draws of an online program (that you can do it on your own time), and I feel like I didn't learn anything from the project either. It started out with academic research (why??) and then we had to develop our own visualization/model - my group did this off a combination of things they learned from other classes and regurgitating homework things. Because I was the only person on my project who was both a native speaker of English and not a professional software engineer, I got stuck doing research/report-writing. This allocation of roles optimized our group for success grade-wise, but I didn't learn anything new about analytics as a result. Polo points to "learning how to work with a group" as one of the big reasons why the project should be part of the course, but that's not what I signed up for OMSA for! If the degree was undergraduate or on-campus, I think Polo's point would be stronger, but I experience group dynamics every day during my day job and don't benefit from more of them after-hours.
All in all, I think the best way to fix this class would be to split it into two classes and go deeper into the content. Have one class that covers Visualization Theory and Practice very thoroughly (and properly introduces students to JavaScript), and have another class for Cloud Computing and Applications that dives deeper into these technologies and shows how they can be used to manage data processing and machine learning models. And axe the group project completely.
If you want to spend a grand to teach yourself everything you need to know, then this is the class for you.
I really don’t understand the purpose of this class or why it is required for OMSA - or why anyone in OMSCS would subject themselves to it. You are taught NOTHING. I took a class on and read a textbook about D3 in advance of this class and still felt very unprepared for some of the assignments. The lectures are basically useless. If we’re required to use a language in our homework, why aren’t we taught the language?
The entire class is 4 homework assignments and a group project. The homework assignments are 14+ pages long. They are often confusing and have errors.
The one positive thing I can say is that the grading is quite fair, though you can lose points for a number of very small errors, like an extra space in your filename (this is not hyperbole). That said, the grading on the homework is OVERALL fairly generous. On most questions, you can get partial credit. The final homework, however, had a question worth 35% of the grade (3.5 points of your overall score) that was pass/fail. Get it all right, with the minimum accuracy, or you get a 0. There were also bonus quizzes offered (though only for a maximum of 3 points).
Really, the worst class I have ever taken.
One of the worst run classes in the program
Homework is riddled with errors. Rubric is highly vague and inconsistently enforced. You will have to look up instructions for the project and homework in multiple places (canvas, pdf, jupyter notebook, piazza) and there will be no one place to get all of the information you need to do an assignment correctly which will lead to wasted hours. The TAs do not offer clear explanations when anyone asks for clarification.
The class makes such poor use of grading tools that you ask yourself why they even bothered.
To demonstrate, instead of allowing you to submit an ipynb file directly to Gradescope, you have to run a helper function to publish the notebook to a submission.py file. Multiple times this helper function did not work properly and would not overwrite the file correctly, or would not copy the contents of the file exactly as they appeared in your notebook. This resulted in a significant amount of wasted time on what was otherwise a trivial homework assignment.
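For readers who have not met this kind of export step, the sketch below shows roughly what a notebook-to-script helper does. It is not the course's actual helper, which the review does not show. The function name `export_code_cells` and the file names are hypothetical, and the only assumption about the input is that an .ipynb file is JSON with a top-level `cells` list. Writing to a temporary file and then replacing the target is one way to avoid the stale-file problem described above.

```python
import json
import os
import tempfile


def export_code_cells(notebook_path, output_path):
    """Copy the source of every code cell in a notebook into a .py file.

    Hypothetical helper for illustration only, not the course's tool.
    """
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    text = "\n\n".join(chunks) + "\n"

    # Write to a temporary file in the same directory, then replace the
    # target in one step so an old submission file cannot survive.
    directory = os.path.dirname(os.path.abspath(output_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
        os.replace(tmp_path, output_path)
    except BaseException:
        os.remove(tmp_path)
        raise
    return text
```

Calling `export_code_cells("hw.ipynb", "submission.py")` and then opening submission.py to compare it against the notebook is a quick sanity check before uploading.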
Could have been a great course, the topics were very interesting, but it’s one of the most disorganized courses in the entire program, with poor quality control. Stay away.
This class isn’t hard in that the work is hard, but this class is hard in that it’s incredibly fatiguing and hard to find motivation to finish. This is my 6th/7th class in the OMSCS program, and I’ve taken “hard” classes like Reinforcement Learning and GIOS, but I found this class much harder.
I don’t feel like I’ve learned anything from the lectures, the homework, or the project. The lectures are dry and lack depth. The homework is extremely repetitive and surface level. How much you get out of the project will depend on your team and what you choose to do, but the deliverables are ridiculous. A report, video presentation, AND a poster? A poster???
My past self believed that having things like “Databricks”, “Hadoop”, and all the other softwares touted on the syllabus would look great on my resume. My current self thinks it would be disingenuous to put those on my resume because of how little we actually did with those tools. It would be like if I wrote a “hello world” program in C++ and then had the audacity to put “Expert in C++” on my resume.
To my past self, and all others thinking of taking the class for similar reasons: please don’t. There are so many better things you can do with your time. You deserve better.
After this semester, I am halfway done with my OMSA degree. Even counting all the classes I took in my undergrad at a different institution, this class was the worst academic experience I’ve ever encountered. Mind you, it isn’t that difficult to get an A here. I will be getting an A unless I get 55 or below in this project, which is quite unlikely to happen. However, the experience is so bad that for the first time in my life, I am motivated to write a review of a course. This course is terrible because of the following things:
- This course covers too many unrelated topics at a shallow level. The course title is data and visual analytics, but in reality, only 1 of the 4 homeworks is on visualization (HW2, the infamous D3 homework). I don’t mind the long instructions on each homework, but those long instructions still do not adequately explain what we need to do at times. One of the questions in the first HW requires us to use software that the professor made, which is unlikely to be used in the industry. The third HW is basically doing the same things in each question except on different platforms (AWS, Google Cloud, Databricks), and the fourth HW suddenly asks you to build a random forest from scratch. They are definitely not related to visualization, and the topics don’t connect. I’ve been exposed to tons of software, but only at a hello world level. Frankly, I was surprised that they don’t teach things like Python’s seaborn and bokeh, when they are some of the major ways we do visualizations now.
- I am sure the TAs were instructed to be this way to not spoon feed us, but lots of TAs can be unhelpful. Whenever I asked a question, they just said look here in the instructions, when I asked that question because I wanted to clarify what the instructions said. Some TAs were helpful, but I would say the majority were not. The professor is also pretty much absent in the class.
- Half of your grade is on a really vague project. First of all, you must be in a group project so your grade could be determined by who you will be with, which is a challenge since a majority of us do not know each other in the class. Second of all, the project instructions are super long, but they somehow still miss requirements in those instructions and have to post additional ones in Piazza, which you will likely miss if you don’t pay attention. Thirdly, this project is structured way too academically. The majority of us will not go on to the academic field, but from the very beginning, they require us to read and write a literature survey on somewhere between 12-18 papers per group. They require you to make video presentations of your project, which I frankly find meaningless. I get why they require you to make individual presentations of the final project, but I really don’t get the point of making presentations about the proposal in the beginning of the project. It is almost impossible to fail this project, but they are very picky about what they want. They also gave us feedback on the progress report one week before the final project was due, and I’ve seen some groups that were unable to accommodate that feedback because there was not enough time.
- The lectures are absolutely meaningless. HWs and projects have almost nothing to do with the lectures or are doable without lectures. You need them for max 3% bonus points from quizzes, but they post the transcript and you can just search up questions. There are also quite a few moments where he says things like “I mentored this kid for PhD” or “we made this in our lab” and even things like “this lecture was largely based off of this lecture from __ university by professor __” which I find very discouraging.
I also asked my project teammates just to see if I was the only one feeling this way, and all 3 of them felt the same as I did, and I know at least 2 of them are doing fine in terms of grades.
So in a few points:
- not a difficult course to get an A, but very very time consuming and tedious
- they throw lots and lots and lots of topics (which I appreciate since most of them are applicable in industry), but they are unrelated and shallow in depth
- lectures don’t help. Don’t waste time watching them.
- Start HW early, especially HW2. It might be good if you brush up on D3 basics in advance of this course.
- Choose project teams and topics wisely and early.
This is my 6th/7th class in the program. I would rate this as the 6th best class I have taken so far.
The class is a lot of work with very little return on knowledge or skills. I don’t mind working hard if there is a payoff, and I have found that to be the case in other courses I have taken. But shallow explorations of technologies, accompanied by videos that do not teach you much aside from surface level info about each technology and the professor’s own accomplishments, plus often opaque homework and project structure, just don’t make for a good use of time.
This course should be massively overhauled or dropped from the OMSA requirements.
This class is not challenging. This class is not difficult. This class is extremely time-consuming and frustrating because of the way it is structured and managed. It is simply a hot mess of google links, Amazon books, and a semester long group project. As a seasoned developer, I found the coding exercises particularly offensive. Don’t be like me and say to yourself, “I don’t mind hard work. These other people are probably just complainers.” There is no direction or real content in this class. I feel like I could have spent my time doing self-learning over this semester and it would have been time much better spent. There is no real discussion and the video lectures and slides are like marketing blurbs for products instead of providing anything of substance. I actually considered quitting the GATech program because of this class. After speaking with some teammates, this class is more of an aberration than the norm. I will be avoiding any “Polo” classes as a result of this. If he could manage talking as much about the technologies and underlying principles as he does aggrandizing his own accomplishments and connections, there might be a nugget of value here. Avoid this garbage fire if at all possible.
This course has a reputation of being “hard”. It isn’t “hard”, it is just “bad”. As with the large courses in the program, it suffers from too many students, poorly prepared TAs, and a professor that is, for the most part, absent. If you couldn’t tell, I strongly disliked this course. It is the second to last course in my program, and if it were in the first three, I probably would have bailed on OMSA. So, I can gripe about the fact that the course is scattered across a ton of disjoint topics, or the fact that it relies on poorly implemented autograders, or that 50% of the course is based on a project where you can “do anything you want”, except when you can’t. I could complain about how, in one assignment, you need to write code in a jupyter notebook that outputs a .py file that you have to upload to an autograder that provides almost no debugging info, or how you need to be able to implement, for decision trees, python code that leverages recursion, which is never even mentioned. I could complain about how the syllabus is a web page that is about 10 pages long, and references another web page for the project that is about 10 pages long… and yet, somehow, manages to tell you next to nothing. I could gripe about how it seems that half the TAs are arrogant CS students that are out to prove how smart they are. Lastly, I could gripe about the fact that they literally rolled their own docker container for spark… and it just plain didn’t work. You don’t need to do that, folks. There are docker containers that are already pre-made that were created by the Apache spark project… and they actually friggin’ WORK!
But, rather than just go on and on about how awful this course is, here are some tips for, if not success, then survival. 1. Get on a good project team and get on it early. 2. The project is graded by “box checkers”. Make DAMN sure you understand exactly what is required for each part of the project by understanding the rubric. And, yes, they really want you to sum up the entire project in the proposal video in 2 minutes. 3. Read up on D3 and search the web on how to implement decision trees (you’ll need to understand recursion) and random forests from scratch. (I did both of these, and they made homeworks 1 and 4 go much more smoothly.) 4. The bonus point quizzes can help. Make sure you watch the videos first and get a copy of the transcript of the lectures. The quizzes go very quickly (10 minutes for 11 questions, and no going back to a previous question.) 5. Eventually, this class comes to an end. So, make it through, take a shower, and then move on.
Bottom line. This class sucks. It really sucks. You have to take it… and that makes it suck all that much more.
The course is easier compared to say GIOS but only if you have a background in Python, SQL, and Javascript. Having a strong group for the project makes all the difference. Homeworks 3 and 4 can be completed in three days if you have a strong background. Homework one took about a week, and homework 2 took the longest because of D3. But that homework has a lot of online resource support. Lectures only serve a surface level survey of a topic; they are meant for bonus points. The course also ends early in the term with no exams, which is a plus. I think this course is a great first or second course to pick up in the program, however know that the course is geared largely to experiential learning and surveying some of the most popular analytics technology in the field. You’ll find yourself on several stack overflow posts and online tutorials to complete the work. The work itself is not terribly difficult if you have a Python/JS background. My largest complaint is the homeworks were very wordy and contained some errors that needed to be updated in subsequent versions, which hindered progress.
Is this a data viz class? Or a research class? And if it is a research class, why focus on research in a CORE class, when the time could be spent learning very useful tools?
I am going to echo what others have said but add a little onto it. For OMSA students, this class is required and why it is…. is a bit confusing. This is my last class (saved for last because I knew how terrible the experience was going to be based on the overall feedback from people I know who have taken it) before I have to do my practicum next semester.
Sure, data visualization is a part of the class, but why are we essentially doing a project (as a group, which by the way is a huge dumpster fire in its own sense) that is basically a research paper? Be warned, you’ll have to juggle VERY difficult assignments and do the project, which is a time suck on its own. The lectures in this class are pretty useless (you can pretty much get through the class without even needing to watch a lecture; also, you can just use the find function in the transcripts to do the bonus quizzes).
In addition, it felt like the most USEFUL stuff was barely touched (like AWS and other computing stuff, d3, etc). There was no emphasis put on any of the python data viz packages (?????) like seaborn and what not. I understand that data manipulation is important, but why not fill the class with more useful things rather than have us do a class project where people have to literally scrounge up something and hope something sticks on the wall with a makeshift team?
Also the grading criteria for the projects seems all over the place, getting docked for points on things that weren’t even mentioned they were going to be graded on.
The one positive thing so far is that the homeworks and what not are very useful and interesting. It’s just the dumpster fire in everything else (the class project, the way the lectures are structured) that makes it probably one of the worst classes I have ever taken.
I’ll come fill in more specifics when I have the energy. Right now, it’s more important to warn people (in OMSCS, who have a choice) away from this dumpster fire.
This is my last course. After this (and the practicum), I will have finished both the OMSCS and the OMSA. And after five years of searching, I’ve finally found the absolute. Worst. Class.
Instructions are labyrinthine, when they exist at all. Point deductions are arbitrary, capricious, inconsistent, inscrutable, and not seriously considered on regrade.
Lectures are disconnected, both from each other and the material tested in the homeworks. This is not a course in the traditional sense. It’s just “figure it out yourself, and we’ll give you a grade.” (That grade may or may not correlate with what you turn in. No one knows for sure.)
I’m currently on-track for an A, but I still resent the abject lack of consideration or forethought that went into this wreck of a course.
This is my least favorite course so far in this program. OMSA students have to take it but OMSCS students please stay away. The lectures are very general and only touch on the surface. Then you need to figure everything out for homework, which is really really painful. There is not that much about D3 online that you can find help with. Taking this course was the most stressful and frustrating experience (probably in my life). The professor cut us a lot of slack with COVID though which I really appreciated. At the end of the day, I learned nothing much from this course because it covered too many things on the surface.
My main advice (coming from OMSA) is that if you don’t know JavaScript, take at least 2 Udemy or Coursera or similar courses on JavaScript before coming to this class. If possible, take a course on D3 or watch the 12-hour “complete tutorial” on YouTube and complete it. Do all of that before you start this class. Just build it into your previous semester.
My second advice is personal. I have struggled with mental health issues of anxiety and depression for about 12 years now. I was on an improvement track with these issues. Combining this class with the Covid-19 pandemic and a few other family issues has derailed my progress. No joke to say that I’m only 6 weeks into DVA and have met twice already with my psychiatrist and have had numerous discussions with my partner so as to minimize filling my head with self-demeaning talk and negative thoughts. Point being, if you have mental health issues you deal with already, put all of your support systems in place for this class. I literally had to talk myself out of dropping out of GT altogether – and I’m one semester away from graduating.
I’m going to get a B or C in this class. It is ok. Because making an A would probably kill me. So just take it in stride and learn what you can and move on. You are a beautiful, wonderful, smart, and worthwhile human being even if this class might try to make you think otherwise about yourself. Peace to you all.
This course is completely abysmal. Even though I got high marks, it was the single worst experience I ever had as a student. 50% of the grade is based on 4 HW assignments, and the other 50% is based on a semester-long group project. The amount of work is definitely taxing, especially in the final 6 weeks, but the course is made worse by how poorly the instructors (i.e., professor and TAs) run it.
The homework assignments were challenging because students are essentially learning new software or languages that they may have never utilized (e.g., D3). However, this aspect isn’t the terrible portion. The truly dreadful aspect is the fact that the homework instructions were literally 10+ pages long, and they were riddled with mistakes and errors. Students had to satisfy every single requirement, or the TAs would deduct points with limited reasoning. If the instructions changed, or if an error was corrected, or if something unclear was clarified (which all happened far too often), then the TAs usually provided the necessary details in a Piazza post, buried under other posts from students, instead of providing a revised set of instructions for the entire class to see, like the syllabus says they would.
The project is another mess entirely. It is structured like an academic research project. The main focus is to take a “big” data set, perform non-trivial analysis on it (e.g., clustering, classification, etc.), and then present the results in an interactive visualization. In addition, students needed to write a large research paper, similar to ones published in academic journals, and in the paper, we had to write an extensive literature review on 12-18 sources, depending on the size of our group. I understand the importance of explaining processes and results, but too much emphasis was placed on the research paper. The OMSA program is designed for people who intend to work in the private sector, not for those pursuing a career in academia. The time spent writing the paper could have been better served learning more about the tools covered in the HW assignments.
Continuing on, the group project required the submission of a project proposal and presentation, as well as a progress report about three weeks before the project was due. The final project submission occurred a couple of weeks before the official end of the semester. The final submission needed to include: the final paper; all of the code that was written (with comments and explanations); all of the other supporting files (e.g., diagrams, data files, etc.); a presentation poster that explained the completed project; and a README file that outlined everything in the final submission. And just like the homework, the TAs combed through every little detail for all of the deliverables. If something didn’t meet their subjective standards, then they deducted points, and the reasoning they provided was often limited.
The most frustrating aspect of the class was that the TAs and professor are simply not helpful, whatsoever. In fact, some of the TAs were incredibly condescending towards the students. The TAs were the ones who put together the homework instructions for each assignment and every single one was riddled with errors and contradictions. The same problem also persisted with their Piazza posts. For example, there were several occasions in which the TAs gave demonstrably false information, and then the professor had to make a post just to correct them. I was definitely challenged in this course, and I learned quite a bit. However, the class is so poorly managed and executed that it was an absolute nightmare to endure.
We were warned about the amount of effort in the syllabus, so I can’t say it came as a surprise. The TAs and the professor were always available to answer questions fast and well.
However, I’m not leaving this class with a fulfilling feeling of accomplishment. I didn’t do too bad in the end (probably around 89.6%), but I’m still leaving extremely frustrated.
First, because this course is a core requirement, and it should not be. It’s halfway between the Analytics and Computer Science MSc, meaning it’s optimized for neither, and therefore should not be core.
It also should not be core because too much of what it teaches is not core to analytics. Having half the homeworks covering Javascript and D3 is ludicrous for analysts. Javascript is a fantastic language but not a requirement for most analytics positions. Many other technologies and libraries, such as R and Python, ggplot, matplotlib, bokeh, seaborn and many more, are widely used and completely absent from this Visual Analytics course. So in a core requirement course, we have to pick up a tech that we’re unlikely to use, and get zero exposure or deepening on techs we are sure we have to master.
Yes, new positions will come with a requirement to learn new skills and technologies in a limited amount of time. But nobody will ever ask an analyst without JS experience to output 8 different interactive visualizations in 30 days - at least not without a personal trainer.
Although I admire the work produced by previous students, I don’t see the point in making us use their tools, like the network graph representation one. We’re being graded on a tool that we never used and that is far from being an industry standard. There is no point in asking us to use such a tool, except for boosting usage of the tool, which might be nice for the developers but is irrelevant for the students. That’s not what we paid for nor what we expect from this course.
I also understand the strictness and rigidity of the requirements and the absence of appeal after the due date, but I found myself stressing a lot more about meeting these structural/administrative requirements than about my results. I paid close attention to duplicating the downloaded directories and submitting exactly what was asked, and still managed to submit the wrong notebook for the main PySpark question, which cost me 35 points on the homework. It’s a tech I already know, I would have gotten most of the points and caught up on my D3 disaster, and my final grade is impacted because after hours in front of the screen I messed up the upload without a way to correct it later. Granted, it’s on me, but in the end my grade reflects more my ability to follow cumbersome guidelines than to master skills necessary to analysts.
This course tries to do too much. Visualization, big data, three different cloud platforms… I’d prefer this course focus on visualization only. There, we would have three months to learn about D3, which is a much more decent timeline. We could also work with the R and Python visualization libraries, as well as get a certification level with Tableau. Doing analysis on the cloud should be its own course, and big data analytics as well - and it is, so these parts basically just come as an ersatz of other Analytics courses, and are therefore useless in the grand scheme of things as far as the Master of Analytics is concerned.
The lectures are not helpful at all. They just give a contextual understanding but don’t get you any closer to completing the homeworks. It really feels like a “struggle and do it” experience, which is absolutely necessary for mastery, but considering the price paid for the classes I expect more guidance and support than being given projects, being told they are hard, and being asked to figure it out on my own. I see no difference with me doing that with free resources and projects I can come up with myself.
In a nutshell, it’s a core requirement, you have to take it, but it’s a grind. Start the homeworks as soon as they are released and you’ll be OK.
This was my first course in OMSCS, and I was really glad that I took this as my first course. Obviously, that conclusion is based on my personal preferences. Writing/reading, and doing exams has never really been my strong suit. And given I have been out of the academic life for 15+ years, I really wanted a hands-on class that relied more on my programming skills rather than theory. I wasn’t disappointed in that.
Also it helped that I have had a variety of experiences with different programming languages, so that helped adapting to everything that they throw at you in this class. And so I understand some of the negative reviews here, because if you are not relatively well versed with JavaScript, Python, SQL etc. you will need to put in extra effort to very quickly learn to work in these languages. I had never worked in Python but luckily had experience in Ruby that helped me learn Python in time. Nevertheless I still had to put in a LOT of effort.
Now for a more structured review:
Course content
Homeworks
- The homework (4 of them) take a lot of effort and are 50% of grade
- Each homework is like 15 pages long, and has up to 5 questions
- Each question often involves a different programming language, technology, tool, or area of the DVA world.
- You have to really work for each single point (100pts per homework) because there are just so many items to do in each question, and most tasks are granular.
Project
- Group project, with 50% of grade. Form a group ASAP and start brainstorming because homeworks will not give you much time
- A lot of freedom on what you can do and what you choose to utilize
- More emphasis (and grade yield) is on the proposals and reports, and final poster+presentation, rather than the actual software
- It has to be an interactive application, having “non-trivial” analysis and visualizations.
The good/the bad
Pros
- You get to learn a lot of different technologies in this course
- Because of the breadth of problems and tools thrown at you, you really get over the fear of doing something completely different and new just because you never felt that you were suited for that kind of thing. Like allocating several distributed computing resources on cloud services so that you can run your algo that sifts through large datasets.
- No exams
- Lenient grading, on the project
- Practical and hands on homeworks, that walk you through so many new things, that it will feel very rewarding to complete them
- Given the granularity of grading so well defined, at project and homework level, you can very easily choose to put in as much effort as your intended grade.
- Responsive TAs, and professor often interacts with students on Piazza
- Very interesting subject, has a little bit for every kind of person
- Lectures are well made
- Get several chances for bonus points, you can score ~105% in total
Cons
- Due to the amount of things to be done in the homework, the descriptions are often plagued with minor typos/errors, and ambiguities. This is a damned if you do and damned if you don’t kind of a situation. Too many things to do (and to accurately describe what to do) means too much text necessary to completely cover everything. But time and practical constraints would force you to limit those to 15 page/homework, and thus will have tasks that are often vague, and some that are too detailed.
- The above leads to too much (way too much) effort to fully flesh out what exactly you need to do (autograding or unit test like tools are not available)
- The reason for the excessive effort in the previous point is the number of students (~1000) in a single Piazza section. There is just too much noise. Before you have even started your homework (thinking I am gonna be up and early on this) there are already 100+ posts on each sub-section of each question on piazza.
- And if you don’t actually take the effort to go through hundreds of repetitive posts on all questions you will MOST DEFINITELY miss important points to cover to actually get the grade for the question. Unsurprisingly things are tricky at this level of education. (of course for those that would like to achieve 100%).
- Project “Grouping”… somewhat.
- Lectures were underwhelming, not in quality, but in quantity. They were few, and did not cover as much of the “analytics” and “visualizations” as I had hoped.
Well, where to begin? And how to add anything to the other zillion reviews here that have all mostly said the same things?
I should start by saying that I took this as my last class. This was very much intentional as I know that I am sub-par at coding and that this class would be a real grind. My thinking was that by taking it last, I’d have strong motivation to persevere when the going got tough.
Ok, so you all know by now that the class consists of four homework assignments that will take a significant amount of time and that there’s a group project that will also take a big chunk of time and involve all the usual things that group projects involve. Assignments will consist of a potpourri of languages and tools (sql, python, d3, spark/scala, tableau, azure ML, etc.). Everyone knows this and it’s included in literally every review. So I’ll try to add something unique, I hope.
I will address those who are intimidated, like me, in taking this class for fear that they don’t have the coding ability necessary to really succeed. This was me. I got a C in 6040. After that class, I feared I wasn’t going to make it through the program. But hey, I’m graduating in May…
Coming into DVA, I felt I’d probably be in big trouble. Reading some prior reviews where those with significant development experience refer to the class as “a lot of work” meant that, for someone like me, it would be a WHOLE lot of work. I mentally allocated 4 hours a day, 7 days a week. This turned out to be little more than a fantasy as work and home life quickly began to eat away at my school time and I began to fall behind.
Homework 1 contained one problem that was 40% of the grade, then, like 4 or 5 other little problems. I could not for the life of me get that first problem to work, and just punted on it. Somehow, I was able to get the rest done reasonably, but still only got a score in the 50’s due to missing that first problem.
Homework 2 was the famous D3 homework, where, again, I shipped a nice 50ish on it. It was here, gentle reader, that I realized three fundamental truths at the exact same time (no, I’m not a girl in a world in which my only job is to marry rich)…sorry, been listening to Hamilton soundtrack…I digress…where was I? Oh…,right!
Cue Morpheus voice/meme: “What if I told you that you don’t need an ‘A’ in this class?”
That’s right. You don’t even need a ‘B’ in this class. Heck, you don’t even need a ‘C’ in this class. People have graduated the program with D’s on their transcript.
Second, the project is graded leniently. If your team checks off all the boxes, you will likely get an A on your project. And since that’s 50% of the course grade…
This means you can take a deep breath. Do your best on the homework, learn as much as you can, and even if you don’t do great and turn in some 50’s, you’ll be fine.
About that project.
Again, this advice, and indeed this whole review, is for those who are not so savvy with coding.
Knowing that I wasn’t doing so hot on the homework, I really focused on taking care of my project team and my responsibilities. Even to the point of prioritizing it over homework.
In joining a project team, be up front and honest about your abilities (or lack thereof) and that you will be more than happy to handle any and all reporting. It turns out that the report writing/research and so on are at least as significant an amount of work as the coding. Believe me.
Particularly if the coding is shared by two (front end/back end), someone will have to do the proposal, which may involve hours of finding and reading through 15 to 20 research papers, then synthesizing them into something approaching a paper of your own. Finding the first 5 research papers is easy (remember, they have to be a certain length, and peer reviewed, etc.). With the next handful, things start to get rough. By the time you’ve sourced 20 of them, you realize that you’ve by no means gotten the “easy job” and that this is a lot of work.
There will then be a proposal video presentation, plus progress and final reports. Finally, a project poster. Again, this all will take at least as much time as any technical work. Volunteer for this duty if you are uncomfortable with the coding side.
Other project advice: designate someone as a PM from the beginning. I don’t have to tell you that a good PM has to be the right personality, but having someone make sure the trains run on time will be important. I had a pretty good team (thankfully, we had an OMSCS guy on it). But, at times, I felt that meetings lacked a little focus and communications could have been a little better. A good PM would have been useful.
Anywho, I had a final score in the high 70’s and ended up with a ‘B’, which I was more than happy with. I enjoyed challenging myself with the homework, and didn’t get too stressed out on those sections that were a bit too time consuming for me. I enjoyed HW3 the most, and thus completed more of it than the others. But, with a job in the health insurance industry, work got a bit too crazy, so I had to make some ‘life optimization’ decisions and just do my best.
Bottom line, if you are that biz track student who struggled mightily with 6040 and fear that this class will do you in, you can make it, but you might not get that shiny new ‘A’ and have to settle for less. Just look at the class as a way to challenge yourself as much as you can to YOUR ability, be a good teammate on your project, and you’ll get through it.
I think there are quite a range of opinions on this course. I can agree with some of the negative comments about the detailed homework assignments and how there were errors that were only corrected in Piazza posts that could easily be missed, and it was difficult to get real help from the TAs - partly because the assignments were so involved. I made it a habit to CTRL-F through the Piazza posts, and I definitely found useful nuggets that helped get me through. Piazza in general is not my favorite, but I’ve learned how to get what I need out of it.
That being said, I do feel like I learned a lot in the course out of necessity. I felt satisfied completing the homework assignments because they were hard and pushed me to learn the programming concepts and syntax. I watched a lot of YouTube/LinkedIn Learning videos on D3 and JavaScript in particular, because they were totally new to me. Also PySpark and recursive queries for decision trees.
I would have preferred an additional homework assignment and no group project. With group projects, you really are learning more about group dynamics and how to deal with everyone being last minute than you are learning more about the concepts. Plus, it can be a pooling of ignorance at times because you’re all in the same boat. I just don’t think group projects are pedagogically useful, in general, in terms of the material. They’re good “life experience,” I guess. I have no complaints about my group, but we were too ambitious to begin with and then we got stuck later. However, the grading was generous so it all turned out fine.
There are no exams in this class, which was nice. It was very hands-on, which is why I feel like I learned a lot.
I agree with a lot of the negative points from other reviews here – jumps around too much from topic to topic, doesn’t focus as much on visualization as one would assume given the name of the course, somewhat disorganized, homework questions are often phrased vaguely and confusingly, the lectures and resources they provide aren’t too useful and you’ll do most of your learning from Stack Overflow and finding similar examples. Still, I generally enjoyed the course, felt that I learned a lot, good practice in learning new software and language on the fly. And, I appreciate that it’s required in the OMSA curriculum in that even if people take the easiest electives possible – which still isn’t a cakewalk – they’re forced to take this as well, which IMO makes the degree more valuable since you know anyone who graduated passed this and is competent regardless of their other course choices.
It’s not easy – for me personally I’d say the second hardest course I’ve taken, after only HDDA (which is an elective and easily avoidable if so inclined). And the grading is somewhat lenient especially with bonus points, it seems most of the class gets an A…or withdraws/fails for not turning in assignments or cheating. So maybe “hard” isn’t the right word so much as time-consuming since you’ll probably get an A, which you can’t always say about many less-involved courses.
A few random pieces of advice:
- Pay very close attention to everything on the homework document, and don’t make assumptions. For instance, they gave us some learning tools for version 4 of D3.js, but we had to use v5…took me a while to even realize this after several hours of beating my head against the wall, wondering why nothing was working. Foolishly assuming a resource they gave us for a homework would work on said homework, silly me.
- Don’t go crazy trying to get the final few points on an assignment if you’re really struggling and doing okay otherwise. Especially the further you get in the course and know you already have a decent grade…for instance, I didn’t even try the random forest from scratch, which was half of HW4. But I aced every other assignment and knew bonus points were coming, so even without that I still got an A. And frankly I would’ve been fine with a B.
- Going back to the above, don’t expect your group project to reinvent the wheel. They want you to succeed here; don’t give yourself an ulcer making something that revolutionizes the internet. If you can turn in something that adheres to the guidelines, you’ll likely get a decent grade. The class average for every single deliverable for the project was over 90%.
- Have the course notes open as you do the bonus quizzes. You can CTRL+F through most of the answers and those you can’t, you can probably find on Google.
It’s not that bad. There are 4 homework assignments and 1 final project (by team). I definitely learned something from those. The lectures are not useful though. The grading is very generous. But I will not recommend taking this one with another tough class.
Horrible class. All negative feedback for this class is spot on. Please skip this class if you can. If not, then prepare for a rough semester.
One word; garbage.
Everything about this course is just pure garbage. The review below this one says it all.
Goodness gracious did I hate this class. For SO many reasons. First off, I work as an analyst for a living. I’ve been using programming for statistics for about 7 years and grew up learning how to code basic things. My other education is economics and statistics so I have only had one or two true CS courses.
The good:
- The content in the homeworks is actually relevant and fairly useful in the real world.
- I did learn a ton (though keep reading, it wasn’t because of the teaching staff)
- The group project was okay. It was fun, loved my group, but hated the assignment description/rubrics/timing etc. More below.
The bad:
- I hate to rag on other students but the TAs were CRAP. If you see the name Robert Donaldson, you’re in for a horrible treat. He’s a real jerk. They constantly contradict each other, their favorite response is to copy the part of the homework you’re asking about and send it right back to you, and they often purposely avoid responding to “help promote student interaction.” BS if you ask me.
- The timing of the group project with the homeworks is ROUGH. You will have long weeks, especially the last month or so of the active class time.
The ugly:
- Polo Chau seems more interested in research than teaching (but who is surprised at something like that?) He is unengaged, commented maybe 10 times in the whole semester, and let his TAs run wild without guidance. His videos are useless and mostly just self promotion. I stopped watching after week 3 because they teach you nothing.
- The homework and project assignments/instructions. They are the worst written and organized documents for academic use that I have ever seen. You have to re-read these 14 page documents to find TINY details like ascending vs descending order, whether your notepad file should be .txt or .csv even if it is comma separated, and other minute details. Miss one of them? You lose a minimum of 2% usually, if not the whole question depending on the error. They constantly re-release versions of the homework without warning so if you do any work in the first 5 days or so of the homework window you are probably going to have to redo it. The wording is often so unclear that you have to scroll through Piazza to find if/where someone posted about it and hope that a TA answered it.
- Speaking of Piazza, whoever decided that making one piazza thread for 3-6 parts of a question was a good idea should be fired. Those threads get so long that if you click a hyperlink to a specific part of the thread, it can take MINUTES to load, if it loads at all. Additionally, they tend not to pin important posts so good luck if you see something important and forget to star it.
I really hate that this class is required for OMSA. It only has about 25% to do with visualization so the title is misleading, it is not really taught for analytics students but for CS students, and it is an obscene amount of work and money for little to no support. Still sort of in shock that GT would allow a course like this to stay in the program, particularly the OMSA curriculum.
This is a very disorganized class with programming chaos. I am okay with a TA-run course, but this one obviously lacks capable supervision. The TAs tend to grade like robots, without much use of a central programming platform or an auto-grader. There are other classes where TAs run things smoothly.
About the homework: only half of it is related to data visualization; the rest is just analytics that you are supposed to get from other classes in the program. A very early homework asked us to use a professor-created tool for visualizing graphs, which I will never use again. In fact, I am okay with that tool, but I think teaching us to use more generic libraries would be much more beneficial.
The class project is heavily focused on how well you comply with the grading criteria for the progress report, final report, and poster presentation.
After all, it is a required course for me. I had high expectations, but I was really disappointed in the quality of this class and the materials I learned from it.
My Background: Non-CS undergrad with 3 years of software engineering work experience: 1.5 years as a Scala/Spark developer and 1.5 years of React & Angular development. This was my 3rd class in the OMSCS program.
The assignments were generally easy for me with my spark and javascript background, but the assignments are tedious. I would usually finish each assignment within the first 2 weekends (~12hrs of work on each assignment), with HW2 (D3) > HW3 (spark) > HW4 (light ML topics) > HW1 (SQL) in terms of tedium.
I think the most difficult thing in this class is working with the TAs and the assignment requirements. A few times, homework requirements would change after they were released, or TAs would clarify/redefine assignment criteria in Piazza threads. So if you submit assignments early, but don’t check Piazza every day, you might end up missing that the TAs updated the starter code or requirements, and you might end up losing points. You can lose points on regrade requests, so you have to weigh if it’s worth asking for the points back.
The team project requirements are also not as clear as they could be. For one example, my team lost points for not including literature review in our proposal slides as the rubric given did not say it was needed. Upon regrade request, the response from the TA was basically “other teams included it, so you should’ve, too”, and we had a 5% deduction thereafter.
Overall, using the different technologies and making D3 visualizations was somewhat interesting, but dealing with grading and TA’s was frustrating. Overall, not a hard class if you have the background, but generally would not recommend unless you’re just filling specialization requirements.
Pros: More likely than not, you get exposed to things you haven’t been exposed to that are pretty cool (d3). When you’re done with the class, there’s no better feeling.
Cons: There’s a lot, so stay with me here. In no particular order:
1.) The HW assignments were a mess. You spend most of the time getting stuff set up, and there are always errors on the HW. You have to constantly check Piazza even if you started working on assignments early, in case a requirement changed.
2.) The TAs. Maybe I’m in the minority here, but in my opinion, the worst part of this class was the TAs. I’m sure they were probably instructed by the professor to be super vague with their help on questions that were posted by students. On the real, they provided absolutely NO VALUE to me. Any time a question was answered, all they did was Google the general topic of the question being asked, and post the first Stack Overflow article they found. I don’t mean to sound harsh, but the TAs serve absolutely no purpose in this class and were fundamentally useless.
3.) The group project was a complete bleep show. For starters, it’s a complete free-for-all to find groups, so there’s stress right off the bat, especially if you don’t know anyone in the class (which is the case for a majority of the students). The professor/TAs basically expected everyone to just post something about themselves and try to sell themselves on why they’d be a good teammate and what they’d bring to a team. I thought this was a class and not LinkedIn? Anyway, so after I lucked into finding a group, coordinating stuff with my group was a complete cluster. I would say two of the five of us did most of the work, while the other three were pretty much useless. I didn’t care enough to call them out on their crap, because I just wanted to get through this class and get an A. Maybe I handled that wrong, but the basic point I’m trying to make is that group projects are stupid and it is a complete roll of the dice on how it will turn out for you. I’m sorry, but that’s the truth.
The funny part about the TAs in this class being useless is that they were so nitpicky when it comes to grading different aspects of the project. They definitely went through it with a fine-toothed comb, which is more than what I can say when it came to Piazza and helping students with the HWs. It was almost comical some of the stuff we got docked on.
4.) The lecture videos were useless. For the bonus quizzes, just pull up the class transcript on one page and just do a Ctrl+F on each question. You’re welcome.
General comments: I thought the hardest HW assignment was the last one (#4), due to the random forest from scratch creation. The d3 assignment (#2) was definitely tough, and I had no experience in Javascript before this class. I managed to get 100% on this assignment and I turned it in a week early. Just start the HWs early, and you won’t have a problem. Don’t expect much help from the TAs (did I mention this already?).
I am SO glad this class is over.
Honestly, I found this course very helpful. I understand the sentiment that this course surveys too much content and that most of it stays at a very superficial level. But what I think I got from this course is that it points me to the tools/techniques for practical problems I encountered in the past or may encounter in the future. Although I don’t think I learned much about the tools/techniques from this course, I now know where to start looking if I want to.
Now the details: 4 Homeworks (Coding Assignments) + 1 Group Project (Coding + LOTS OF Reports + Presentation).
Each homework is just a collection of small homeworks on related topics bundled together.
1) Different Usage of SQL
2) Build Interactive HTML page with JavaScript + D3
3) Different Kinds of Scalable Computing
4) Implementation of some machine learning algorithms
I had no prior experience with any of these topics except some basic understanding of SQL and the machine learning algorithms I learnt from other courses in the OMSCS program. Each one takes just 10~20 hours to finish, and for some homeworks we spent most of the time setting up the environment.
The group project, though, takes a lot of time to complete, since there is a lot of reporting throughout the semester and time is very limited. The pro is you are going to work with a lot of amazing classmates (you will probably learn more from your classmates than from the course material). The con is that I feel this kind of write-up is very tedious and not very helpful.
I liked the professor and lectures were easy to get. The class could do better on reducing the “setup” for homework assignments. The assignments were “instructionous”. The group project was annoying. If you’ll take it with another heavy class you might get screwed.
I have an associate’s, a bachelor’s, a master’s, and a graduate certificate in addition to taking these classes for a second master’s, and hands down CSE6242 is the worst academic experience I’ve ever had. I can’t believe rational people made a conscious decision that this class should be a mandatory experience for all of their students to share. I mean, leave aside the problems with the quality of the instruction I’m paying for for a minute, I can’t believe any of the departments involved want their brands to be permanently associated with a class this badly designed and executed.
The course material is all over the place, and most of it has little or nothing to do with the ostensible theme of the course. It’s like the class is trying to be a mini degree program in its own right instead of teaching students how to visualize data. As a result, of all the classes I’ve taken at Georgia Tech, I learned the least about actually representing data visually from this one. They need to split this course up – or at least realize that there are already other courses in the program that teach all the topics it randomly cruises through.
The workload is so out of control that it seems like this course wasn’t built around working professionals but instead around full-time graduate students who are doing nothing this term but taking this class; I’ve been a full-time graduate student before, and I wouldn’t have had time for this course with lab work to do. It’s not just that the course design doesn’t respect your time; it’s rife with open contempt for it. Add to that that the videos provide nearly no information that pertains usefully to any of the assignments. I think you could do the entire course never watching one lecture and never notice a difference in outcomes.
Fifty percent of the course grade is a bulky group project, which is all over the place like the rest of the course and adds another big chunk of unclear, aimless, clunky work layered concurrently onto the homework assignments. The project is theoretically constructed as poorly as it is to represent pitching an idea, building it, and communicating progress and outcomes like you would in a real environment, but I work in a real environment, and it doesn’t even vaguely resemble this process. I’d be fired if I tried to use this as a template for real work.
The workload of the assignments wouldn’t be as frustrating as it is, I think, if they were at least well put together, but for all that they are often basically checklists, the instructions and the grading schemes still manage to be commonly unclear and sometimes conflicting, both in the written guidance and in communication from instructors. All of the assignments feel like they’re in alpha and we’re paying them to be testers.
I know other working professionals who are interested in graduate education in data science, and they ask my advice on whether to take this program. To date, I have steered away about $100k in tuition money from Georgia Tech solely because of what an unavoidable train wreck this disaster of a class is. I expect I’ll probably steer away maybe $1 mil over the course of the next few years. I like most of the program, but I’m not sinking my co-workers into this morass if I can put up warning signs.
The course itself has a TON of information. The assignments are thought-provoking and provide a good starting point for people who are getting into data-engineering, ML, and CS concepts.
HOWEVER, the workload of the course is absolutely excessive. This is coming from someone with experience in many of the tools used in the course (D3, DataBricks, Spark, Python, etc.)
The workload for myself was in the 15-20 hour range per week which could easily get to 30-40 hours for a non-programmer/non-CS person.
My only suggestion is that the course homework workload be dialed down a notch to allow non-CS students to excel.
This class is absurdly easy if you can put in the work. I got an A with something stupid like 102%. There is, however, LOTS of work. Assignment 2 is especially deadly with so many deliverables using JS. My group was awesome and everyone pitched in really well. I’m a software engineer and typically like to just get lost in code. This worked really well with my teammates as we had a project manager who led the charge and let everyone do their thing.
This class was a nightmare. They tried to teach us about 20 different technologies, and all the lectures were completely useless and unrelated. I would have loved to go into more depth on a couple of programs (D3, Tableau, Databricks, etc…); now I am only at “Hello World” proficiency in all of them. The fourth homework has absolutely nothing to do with Data Visualization; it’s just building a random forest from scratch in Python (which you can google and find over 100 tutorials of by yourself). I spent more time in this class trying to install and configure software than actually learning. The vast majority of assignment problems are run from the command line or with PowerShell.
However, there is a ridiculous amount of extra credit and group work- I averaged ~35% on the homework and somehow got a B overall.
This was my second course in OMSCS. I’ve taken a Data Visualization course and a Machine Learning course in undergraduate, and my background is in Statistics.
- Lectures: I stopped watching lectures halfway through the course. They’re not really related to the assignments and if you manage to find the pdf slides used in on-campus lectures, the quizzes shouldn’t be too bad. I’d recommend watching them nonetheless.
- Assignments: The material in DVA was not incredibly difficult for me, but the assignments were tedious and time-consuming. Like many others have mentioned, the assignments require you to learn many different technologies, but merely at a “complete this tutorial” level. I’m not so great at planning, but I managed to finish most assignments in a weekend. Assignments are usually due Friday night AOE time, but you get 2 days of grace period for each assignment. Another thing is that assignment writeups are subject to change, so if you finish the assignment the weekend after it comes out, there’s a very real possibility that it’s changed at least a little bit before it’s due. The changes are usually small but it’s annoying to go back and check.
- Project: The project is 50% of your grade, completed in teams of 4-6 people. My team was 4 people since some parts of the project were scaled based on the number of people on your team. It seems like the grading was not so much based on how well your project was done, but more on how well your reports are written, and how much you were able to analyze based on your results. It was easy to forget about the project when there weren’t deliverables due that week, and my team wasn’t able to complete some of the features we had proposed initially as we ran out of time at the end.
- Course staff: The TAs are very active on Piazza and have a dedicated Slack workspace for office hours. The professor also makes posts on Piazza but only for logistical announcements. No complaints here.
For Spring 2020 specifically there were many adjustments made due to COVID-19, including pushing back due dates/extending grace periods, giving “wellness points” to assignments, and canceling the poster presentation part of the project completely.
Overall I had an okay experience in this class. Assignments were annoyingly long sometimes but I did get exposure to a wide variety of tools and technologies. Collaborating with three other people across the US was not easy but we made it work. It’s manageable if you plan ahead and spread the work out evenly.
Bad course
I took it as my 3rd OMSA course, together with an MGT course.
The lecture videos are useless; they teach you how to design your plot instead of the techniques.
The homeworks cover too many different fields that do not connect well with each other. As a result, I needed to spend a lot of time learning many basic ideas of HTML, JavaScript, and Java, but could not get a deep understanding of them.
Ironically, due to the coronavirus, the TAs gave many bonus points on the homework and I ended up with an A. But that cannot stop me from giving this course a negative review. I think it would be a good idea for the Prof. to reduce the breadth of the course and focus on the D3 library for webpage-based interactive data visualization.
This is my first course in OMSCS and it was the best choice that I could have made. I have been out of college for many years now and could not imagine myself taking exams to start with. So this course fit perfectly, since there were only quizzes worth just 3%. The rest is all coding. So if you don’t like coding or want to avoid it, this course is not for you. In addition to the reviews below, I liked this course for the following reasons:
- There was no ambiguity in any of the HW instructions. The instructions covered everything from setup to what is expected as part of the solution. If you stick to the instructions to the dot, you are guaranteed the grade points. They also included test scripts to check the folder structure of the final submission so people do not unnecessarily lose points due to mere spelling issues in the file names etc. (most of the HWs are auto-graded).
- The rubric was very clear for the project as well. That helped bring everyone in the project team onto the same page and get started quickly. As long as you follow the project guidelines, you are good with scoring points.
- TAs are very helpful for the most part. I believe each TA takes the lead on a question from the HW and spearheads answering questions on Piazza (this is just my observation). All the TAs were on the same page and would refer to the head TA in case of any ambiguity. This was awesome and surprising (as I had read the opposite in the reviews from previous years). They are also extremely helpful in clarifying things and pointing to resources from the course as well as the internet without giving away the answers. All you have to do is open up and feel free to ask questions during office hours (I believe that’s where you reap benefits from this program/course).
- The program was structured a little towards mimicking a real-world job in terms of the group project, where you may not expect everyone to contribute equally and are still accountable to deliver in order to gain points. If you end up in a team where members aren’t as interested or involved in scoring high, then your grade is a toss-up for sure (since the project accounts for 50%).
The only thing I did not like with this course is the lectures. They are too basic and high level. If you already know a topic you would appreciate the simple terms that the professor has used to deliver the crux of it. But you cannot really learn from it. I always preferred watching the lectures after submitting the assignment (except for the HW4 where you need to watch the lecture in order to be clear on what is asked in the question) just for taking the quiz. I have heard people were fine with answering the quiz just by googling it but I don’t think it will work unless you know the concepts. The references given in the lectures were interesting as opposed to the lecture itself. Don’t miss on the reference links.
So definitely a great course to take if you like coding, clear rubric and strong TA support.
Except for the excessive D3 visualization in HW2, I really liked how the course is structured; I learned a lot and developed an interest in Python and ML during this course. I really liked the kick start on Big Data technologies and the PageRank algorithm. It was a really interesting course, and the project added a lot of diverse learning, but I did not like how some students in the group project took undue advantage and contributed very little. We just let it slide because 3 out of 5 group members were very active and focused and could deliver on the tasks. I really enjoyed the learning. The only suggestion I have is to shift the visualization part more toward modern technical tools like Spotfire and Tableau.
Other students have created an excellent review of this course for this semester. I will add some extra information to their critiques. This course is like hell week during the first week of football practice. It’s like the first week of cross country, water polo, or track practice. It hurts, but you know in the end it will make you better. If you spend at least 20 hours a week then you will be fine.
It’s a LOT of work. I had issues juggling time to complete the homework and make progress on the team project. I started the homework as soon as it was available. Sometimes I had to stop in the middle of the homework to help complete a team deliverable. This put me behind a few days on the homework. I took some vacation days from my full time job so I could catch up. Thankfully, I never submitted a late homework or team deliverable.
Here are some of the personal highlights of the class.
-
Learn about graph data structure. Wow! This was my first exposure to graphs. I can see its application in future professional and personal endeavors. I’m grateful for the opportunity to learn and apply it. The only regret is that we used the Lego data set. I wished we could have applied the homework assignment to LinkedIn, Facebook, or Twitter data set.
-
Learn about SQLite. I have tons of experience with SQL databases: Oracle, MS SQL Server, Apache Derby, MySQL, Sybase. I’ve never used SQLite before. What is nice about SQLite database is that it is just a single file. It also has full text search capability. This feature is what separates it from other SQL databases in my opinion. Unfortunately, it’s support for date is poor but there are work around such as using epoch time.
-
Learn D3.js. I’m a full stack developer with extensive JavaScript programming (React, React Native, jQuery, ECMAScript 2017) . Strangely, I’ve not used D3.js professionally. It was fun to learn about this library. I love JavaScript coding. It would have been brutal had I not known JavaScript prior to taking this class. You have been warned.
-
Learn Hadoop. Although I’ve used Java for over 15 years, I’ve never written a Hadoop Map Reduce program before. It was fun to learn about it. I knew that Hadoop Map Reduce has been passed over for other libraries but it was fun to understand what it does.
-
Learn Apache Spark and Scala. I spent four weekends prior to this class to learn Spark and Scala. I was able to apply this knowledge to the homework and team project as I was responsible for preparing our team data set. Preparing the data is probably the most time consuming task in a big data project.
-
Learn Microsoft Azure ML Studio. This was one of the homework assignments. Our team had an Azure ML expert and we were able to use this technology on our project.
-
Learn the PageRank algorithm. This was very time consuming and difficult. Thankfully, I was able to implement it with massive assistance from videos on the internet. The professor stated that PageRank can be used for any graph data structure. It’s not used just for ranking web pages. Thankfully, I caught this information. This algorithm will be a nice addition to my technology toolkit for future use.
-
Learn Decision Tree and Random Forest. Dang! This was difficult. I think I spent 15 hours on this assignment alone. Like I said, the pain was worth it because I learned a lot. Our implementation had to reach at least 70% accuracy and finish within 5 minutes. My implementation, as graded by the TA, had an accuracy score of 77% and finished in 1.5 seconds. It is very gratifying to see hard work pay off.
-
Team project. I was very fortunate to be part of a great team. Members stepped up and took on assignments. We had a primary and secondary person for most of our team tasks. We had team members across America and one from Europe. Don’t be afraid to work with team members from different time zones. We were able to make it work. Plus, if you work for a large company, you will have offshore team members.
-
Professor and TA were awesome. They allowed students to help each other. I had one class where TAs would delete student posts if the post was TOO helpful. Thankfully, Professor Chau and the TAs, did not delete many student posts. Thank you for allowing students to help each other. As a suggestion to the students, please help each other if you can. We all have been in situations when we need help.
-
Many extra points. The pain experienced in the class was offset by the many extra points. Thank you professor and TAs.
In conclusion, this class was very time consuming and difficult for me. There were some weeks when I spent 30 hours on the class. It was worth it because I learned a lot and ended up with 97.2% in the class prior to any bonus points.
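For readers curious about the epoch-time workaround for SQLite dates mentioned above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, column names, and sample rows are made up for illustration and have nothing to do with the actual homework; the idea is simply to store timestamps as integer seconds since 1970-01-01 UTC and let SQLite's `datetime(..., 'unixepoch')` function format them when needed.

```python
import sqlite3

# In-memory database for the sketch; pass a file path instead to get
# the single-file database the review describes.
conn = sqlite3.connect(":memory:")

# Hypothetical table: timestamps are stored as integer epoch seconds.
conn.execute("CREATE TABLE events (name TEXT, created_at INTEGER)")
rows = [("signup", 0), ("login", 86400), ("purchase", 1_500_000_000)]
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Plain integer comparison handles range filters;
# datetime(..., 'unixepoch') converts to readable UTC text for display.
recent = conn.execute(
    "SELECT name, datetime(created_at, 'unixepoch') FROM events "
    "WHERE created_at >= ? ORDER BY created_at",
    (86400,),
).fetchall()

for name, when in recent:
    print(name, when)
```

Keeping the stored value an integer means sorting and filtering never depend on a text date format; only the display step does.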
Lectures are the worst I’ve had so far (after CP, AI4R): too high level, without any explanation or interesting facts. The projects are fun, but the one with D3 is awful, especially if you know software like Tableau or Power BI.
The highlight of the course was the team project. I joined very intelligent and fun people and we developed amazing stuff by finding datasets, joining them, using different algorithms and creating visualizations in Tableau. A worthwhile experience.
I was dreading taking this class as it has a reputation of being the most time consuming OMSA course. I think I dreaded it so much and heard such terrible things that somehow it ended up being a little better than I expected. The class grades are based on 50% homework (10% HW1, 15% HW2, 15% HW3, and 10% HW4) and 50% project
Here is my advice:
-
Do not take another class with this class. It’s a lot of time, as you’ll see in the other reviews, so I can’t imagine taking it with another class in the same semester
-
Do not worry too much about the lectures. They are laughably high-level for the assigned work and you really don’t learn much. There are 4 bonus quizzes (worth a bonus 3%) throughout the course (you get the average of your top 3) that are based on the lectures, but they are open note / open internet. I copied & pasted the transcripts from each lecture into a document so I could CTRL + F during these quizzes.
-
Use the entire allotted time for each homework. Many times, it took multiple nights to complete one section of the homework, and each homework had 3-5 sections, so you can’t afford to not start right away
-
If you can, take this class with friends! I did my undergraduate degree at Georgia Tech and knew some people from that time in this program, so we all took this class in the same semester. It was really really nice to have people to take it with, not only for the group project, but also for homework support and accountability. We were able to share good resources we found online (since you’re thrown into the deep end with the homeworks and aren’t given many resources at all by the class lectures) and help each other set up environments & figure out the logistics of these different systems (often that was the more difficult part of the homework, not the coding itself). We could also help troubleshoot each other’s work. The homeworks are individual, so you must do your own work, but it is immensely helpful to work alongside someone else.
-
One thing that really frustrated me about this class is that not only do they not teach you the skills needed to do the homeworks, they also do not provide any answers to them! You are thrown into the deep end of self-teaching yourself a plethora of new technologies/environments and you never get to see the best way to complete the assignments. You may have written something that technically passes and you receive points for it, but you don’t get to see the “best practice” way that may have been more efficient. I understand that they don’t want current students to be able to see past semesters’ answers, but it seems like you never truly finish the learning process.
-
For help in this class, there are office hours on Slack, an unofficial class Slack, and Piazza. On Piazza, they organize each homework to have one thread per section (sometimes with an additional thread for set-up questions for a section). I understand that they want to have organization and cut down on the number of questions, but it makes it very difficult to keep up with. You have to scroll through to see what’s been added, and sometimes people comment on earlier questions, so you end up having to read through the gigantic thread multiple times throughout the three weeks of the homework. I personally would have preferred separate threads so I could more easily read about aspects I had questions on. Additionally, with some things being unclear in the homework directions, TAs would make clarifications within these gigantic Piazza threads and not always update the homework document to reflect these clarifications, which hurt students who had already finished or didn’t read through the entire Piazza thread.
-
Homework 1: This homework consisted of an API assignment, a SQLite assignment (pretty straightforward if you know SQL), an OpenRefine assignment (very easy), and a D3 warm-up. The D3 warm-up was immensely helpful for learning the basics of D3 so you could go right into homework 2 with some knowledge and experience. This homework may give you a false sense of security with the time commitment of this course.
-
Homework 2: This homework was 4 intense D3 visualizations as well as a very easy table creation & Tableau chart creation. This homework assignment is infamous as being the most time consuming you’ll have and it lives up to it.
-
Homework 3: Just when you think you’re past the worst of the semester, this homework is assigned - don’t underestimate it. For us, it included an AWS problem, two Hadoop/Java problems (VERY difficult), a Spark/Scala problem on Databricks (pretty easy), and an Azure ML Studio problem (very easy).
-
Homework 4: Also don’t underestimate this homework assignment - it only has 3 parts and they’re all in Python (yay!) but it was a random forest from scratch (very difficult), a PageRank problem (difficult), and a scikit-learn Jupyter notebook (pretty straightforward).
-
Project: Would very much recommend finding a team in your time zone, if not a team in your location you can meet up with in person. You could use any technology in this project so our team used Tableau since we had experienced people and that was lovely to be able to do (MUCH easier than D3). Pick an interesting topic since you’ll be working on this project throughout the semester. And finally read the rubrics thoroughly - they don’t give you many pages to fit all information you have to give and it’s also easy to leave things out, so double check you’ve fulfilled all the requirements.
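To make the grade breakdown described in this review concrete, here is a rough sketch of the arithmetic in Python. The weights (10/15/15/10 for homework, 50% for the project, up to 3% bonus from the average of the best three of four quizzes) come from the review itself. How the bonus is actually applied to the final grade is an assumption here (simply added on top), and the function name and sample scores are made up.

```python
# Weights as stated in the review above.
HOMEWORK_WEIGHTS = [0.10, 0.15, 0.15, 0.10]
PROJECT_WEIGHT = 0.50
BONUS_WEIGHT = 0.03  # assumption: bonus is added on top of the base grade


def course_grade(homework_scores, project_score, quiz_scores):
    """All inputs are fractions in [0, 1]; returns a percentage."""
    base = sum(w * s for w, s in zip(HOMEWORK_WEIGHTS, homework_scores))
    base += PROJECT_WEIGHT * project_score
    # Only the best three of the four bonus quizzes count.
    top_three = sorted(quiz_scores, reverse=True)[:3]
    bonus = BONUS_WEIGHT * sum(top_three) / 3
    return 100 * (base + bonus)


# Hypothetical student: one weak quiz is dropped from the bonus average.
grade = course_grade([1.0, 0.9, 0.8, 1.0], 0.9, [1.0, 0.5, 1.0, 1.0])
print(round(grade, 1))
```

With these made-up scores the base comes to 90.5% and the full 3% bonus applies, since the 0.5 quiz is the one that gets dropped.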
CSE 6242 is one of the worst classes I have taken in my academic career. It is unclear to me what the learning objectives or the purpose of the class are. The class is largely self-directed learning, which is disappointing given that we are paying to be taught concepts, but you would be just as well off without any of the class material.
The class consists of two portions worth 50% of the grade each:
-
Homework
-
There are a total of 4 homeworks, each covering a wide array of programming languages and CS concepts including mapreduce, D3, HTML, CSS, Python, Scala, SQL, Pig, Hive, AWS, Azure, OOP and Java to name a few.
-
The lecture videos cover topics at a very high level and offer no appreciable help in solving the homework problems.
-
The answers to the homework problems are not published, so if you miss a question you will never get to know exactly what you missed or know the proper solution. This part is especially troubling to me as the class is self-directed and I ended up producing answers that were acceptable but that I knew were not necessarily the correct or best way to solve the problem. This creates a situation where I now “know” how to solve certain problems, but that these solutions would not be acceptable in my career making these exercises pointless.
-
Arbitrary grading policies where you will be deducted points for items that were not explicitly detailed or explained.
-
Posting on Piazza is a waste of time as posts usually go unanswered for days at a time and deadlines can be quite tight especially when project items can be due at the same time. I had one post go unacknowledged for 10 days. When posts are answered the feedback is often unhelpful with responses like “Watch the video” or “Read the tutorial” being the default.
-
-
Class project
-
Typical problems of class projects that were amplified by the fact that this is an online program.
-
Grading criteria is ambiguous and no examples are made available.
-
More arbitrary grading with no actionable insights or feedback. There are various checkpoints where you must submit pieces of the project for grading and my team was deducted points with feedback like “Would have liked a little more explanation” without noting which section the comment was applicable to.
-
One big problem with the project is that the drop deadline is late in the semester after teams are formed and we had members drop right before the deadline making much of our planning moot.
-
Ultimately this class was extremely disappointing for me as I had been looking forward to it because my work requires creation of many visualizations. This class has provided me with no applicable skills that I can use moving forward in this program or in my career. Since this is a required class my advice to anyone needing to take it is to get into the Slack ASAP. For reference, I got an A in the course and have 5 years industry experience.
I felt this was a great course to start with in the OMSCS program. I did, however, feel that the projects/homework were unnecessarily time consuming and beat around the bush too much. It would have been better to cover more topics with shorter assignments. Also, I felt like a lot less time should have been spent on teaching R. It seems to me that everyone in this program should be able to pick up a new programming language with little to no hand holding.
This course is by far (as of Nov 2017) ranked the worst among all OMS courses, but I still want to leave my comment, because I’m shocked by the inconsistency of the OMS standard after taking 4 great courses. I registered for this course after reading the on-campus syllabus (my bad). When I saw the OMS syllabus, it was too late to withdraw and waitlist another course, so I decided to take it. The on-campus version included all the modern technologies: SQL, Tableau, D3, Hadoop, Azure, HBase, PageRank, etc. But the OMS version is basically week 1-6: intro to R and ggplot in R; week 7 till the end: regressions in R. The syllabus speaks for itself; a more appropriate name would be “intro to regression with R” or “the linear regression half of stat 101”. This knowledge might have been useful 10 years ago when everything was analyzed in memory, but now… I’m almost at the end of this semester, and I’ve completed most of the homework/project on slow Friday afternoons at work without watching the videos. I have a statistical background, but to be honest, the course (homework) covered nothing more than linear regression, and I assume most of us coming to the program should have some knowledge of linear regression.
The only scenarios I think you should take this class:
- you need a low involvement course: e.g. you are pairing it with an intensive 30+ hour course.
- (more important) if you are in the Machine Learning specialization, you have to choose 3 electives from the 4 courses offered. If there is one course you don’t want (e.g. BD4H), you’ll have to take all 3 of the rest.
This is a bootcamp style class where you get to play around with tons of big data technologies, ML and graph libraries. Homeworks are simple, similar to the difficulty of ML4T homeworks except that the material is all over the place (again - a bootcamp).
HW2 took the longest of all, and it helps if you are comfortable with javascript and d3 library.
There is a group project which is 50% of your total grade. I enjoyed the experience because I had an amazing team - we had a good mix of skill sets and everyone pulled their weight to deliver something decent in the end. I can see this class being hell-ish if you end up with a crappy group.
Great course and I thoroughly enjoyed it. So many technologies, tools and platforms; quick learning and acclimation is the key. Homeworks are challenging. Project work was unique and I really liked it a lot. I was lucky to get a good project team so the workload was shared equally. Would recommend to folks who have a good programming background.
This course is not as hard as I thought. I did spend a ton of time completing the homework assignments, but I always started them as early as I could, and I spread out the effort over multiple days. Maybe the total hours spent were large, but since I started so early, I never had to stay up late. Many people complained about the second homework, which is visualizations with D3. However, I found that interesting and enjoyed it. My hours spent on the four assignments were about 15 hrs, 25 hrs, 15 hrs, 10 hrs.
It is also important that you find a good team, like many other reviews mentioned. That can really determine your experience in this course. I was really lucky that I was in a nice team. Everyone was working hard and each of us had different specialties and contributed to the project. I don’t think any of us had to spend more than 20 - 30 hours on the project.
I finished with a very high score in this class, but I don’t feel I learned that much. For example, the third homework was about big data tools, such as AWS, Hadoop, Azure, Hive, Pig, etc. Yes, I finished the homework and got full points, but I don’t feel I know much about them or feel confident enough to put them on my resume.
I agree with many others that this class is trying to fit in too many things. They should remove some and just focus on a few and get deeper in them. That way I think we will learn more.
There were very few positives about this course. It can’t decide if it’s a technological survey course or an analysis and visualization course. I think the material and supporting assignments need a bit more focus, as I spent more time trying to understand the tech going into things as opposed to the important visualization material. The group project is semester long with graded checkpoints along the way. You have to find your own team, and the system they propose means you have to comb through all of the Piazza posts and hope you find a team.
The syllabus (https://poloclub.github.io/cse6242-2019fall-online/) does a great job at describing what this course is about. CSE-6242 adopts a breadth-based approach to teaching students Data Visualization and Analytics. You will learn many things including:
- Using JavaScript, HTML, CSS and the D3 library to create static and interactive visualizations.
- The Map-Reduce programming model and using both Apache Hadoop and Spark.
- Applying different tools and languages to solve data problems: Apache Pig, Hive, HBase, Databricks, SQL, Java, Scala, Python, Jupyter, scikit-learn, Tableau.
- Running your data manipulation programs (map-reduce jobs) both locally and on the cloud, specifically AWS and Azure.
- Cool topics like:
- Graph Analytics, Knowledge Graphs.
- Processing large amounts of data (i.e.: Big Data).
- Implementing Decision Trees and Random Forests from scratch!
- Optimizing algorithms using virtual memory (PageRank as an example).
- Best practices and techniques for creating good data visualizations.
- Overview of machine learning algorithms and ensemble methods.
- Text analytics.
- Evaluating ML models.
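Since PageRank comes up in the topic list above (and an earlier review notes that it applies to any graph, not just web pages), here is a small, hedged sketch of the basic power-iteration idea in Python. This is not the course's assignment or its memory-optimized approach; the function name, the damping value of 0.85, the fixed iteration count, and the tiny "team" graph are all illustrative assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Basic power-iteration PageRank.

    graph maps each node to the list of nodes it links to; every
    target must also appear as a key. Returns a dict of node -> rank.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # Split this node's rank evenly across its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical non-web graph: who hands work off to whom on a team.
team = {"ana": ["bo"], "bo": ["cy"], "cy": ["ana", "bo"], "dee": ["bo"]}
ranks = pagerank(team)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

The ranks always sum to 1, and the node that receives the most incoming "weight" (here the one three teammates hand work to) ends up on top.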
If you dedicate enough time to this class (about 8h/week on average in my case) and start working on homework projects as soon as they are released you are almost guaranteed an A.
While taking this course, my wife and I welcomed our second baby, we moved across states and I took a new job along with TA’ing for another class in the program and still managed to get an A. That said, don’t expect this to be a walk in the park; it’s a graduate-level course from Georgia Tech. You’re expected to work, a LOT!
For the team project, I think I was lucky I had an amazing teammate! We started with four people on the team, which is the minimum number of students to form a group, and ended up with two halfway through the project (the other two had dropped). My advice here is that you do your part and try to engage all members of the team.
The video presentation is a good learning experience because it forces you to really experience how much effort goes into preparing and delivering a presentation to communicate data via good storytelling.
The course material is well structured and thought out. The professor and TAs are engaged and helpful despite the insane number of students in this class (over 1000 between on-campus and online).
If you’re pursuing a career in Data Science or Data Engineering you should take this class!
First, the good. This class did a really nice job of giving a review of a lot of tools out there that are kind of in vogue. Now I can credibly say that I’m comfortable with Hadoop, Spark, Pig, d3. Also, I thought the content of the homework assignments was great. They encouraged a lot of independent research, but not so much that it was unwieldy.
Next, the bad. Personally, I found the video lectures pretty lacking. They were almost absurdly high-level compared to how in the weeds homework got. Also, I just really hated the group project. My team did well, but it was just so difficult to coordinate in a program like this where students have a lot of different lifestyles and everything was obviously remote. Further, we divided up the work in such a way that each person focused only on one domain. So this may have been a good approximation of the workplace, but I just didn’t feel that was necessary. Really, I’d prefer to just have a 5th assignment. Also, the first drafts of the homework assignments were rough. It was commonplace that TAs would post updated versions a week after an assignment opened. Or they would post clarifications in Piazza, but you’d need to sift through hundreds of comments to find that. You would be almost penalized by trying to work on homework early, because they would clear up poor instructions later.
I disagree - in a slight way - with a lot of reviews here. I don’t think the assignments were busywork. Actually, so much of what is in the projects is deeply relevant to industry (and is remarkably up to date, using newest versions of libraries!) D3, Spark, AWS, Azure - these are things you’re likely to encounter, and having a bit of experience with them is more helpful than you may realize if this is your first time using them.
The problem with the assignments was that they were riddled with errors and conflicting requirements that could have been caught if they’d done even the bare minimum of QA on them. Instead, they handed rushed first drafts of assignments to students, which caused enormous monolithic Piazza threads full of repeat questions. Answers from TAs were slow, and often incomplete, or conflicted with OTHER TA answers.
The content is great, but until they get their shit together, I can’t recommend taking this class.
An absolute disappointment, chaotic structure, and a complete waste of time and money. This was a class I had wanted to take even prior to joining the program, and the previous bad reviews didn’t change my stance. I wish I had listened to all the bad reviews.
The only thing I learned was to hate all the technologies it introduced. The lectures are garbage, very high level, with no help for the homeworks. The TAs are not at all responsive to Piazza posts. It’s actually very hard to work on the project at the same time as the homeworks, because the homeworks take over your life. The test cases for the homeworks are very poor, so it’s impossible to figure out if you’re going to get docked points. The instructions for everything are unclear and poorly written. The final project and all its associated work was due the weekend of Thanksgiving, in a giant middle finger to a majority of students.
I’m sitting at an A before the final project and I still regret taking this.
I would not recommend this as a first course to students. This course is extremely difficult and takes a great deal of time. Work is unfairly graded. Would only take one course a semester if taking this course.
This course covered various visualization and big data techniques, which I liked the most.
The homeworks were well-designed, which allowed you to touch on the different necessary knowledge points for individual techniques. But the instructions were very confusing and had to be modified several times according to feedback from the students. So I would not call the course materials and homeworks hard, just time-consuming, since 95% of the techniques were new to me.
Regarding the project, the best advice I can give is to find a good team with the necessary skill sets and with time for discussions and for actually doing the work. I happened to end up in a team with very busy work schedules. It was hard to communicate with one another properly.
I like the projects of this class. It can be challenging going through so many different technologies at such a quick pace; however, the grading seems to be fair as long as you work hard. I don’t appreciate the amount of busy work that just adds on top of the actual workload. The HW instructions are confusing, and so are the setups.
Lots of people complained about this semester. I don’t have any strong feelings. Ended up with a C, but that’s because I didn’t turn in one of the projects.
A helicopter-style overview of various topics in data science, including data processing, data visualization, big data tools, and ML tools.
It taught a bit on each topic, i.e. breadth coverage without depth.
I am not a fan of this style of teaching, as I would prefer a Big Data course + Data Visualization + …, each in depth, instead of teaching the basics of each topic.
By the way, the project is an interesting one, and each group of 4-6 can decide their own topic and do whatever analytics and visualization they like.
4 individual assignments + 1 group project. Various tools for different parts, e.g. Gephi, JavaScript + D3, Java/Scala with Hadoop and Spark, some Python.
I believe someone previously likened this class to a survey course, and I can’t say I disagree. You will walk away with a lot of new words to add to your resume, but no in-depth knowledge. Managing the group project is more project management than actual computer science and comes with all the usual challenges that group projects generally do. The professor is very strict about the HW submissions, and if like me you upload the wrong ZIP file you will get 0%. Considering that there are only 4 HWs and they contribute 10-15% of your grade, well - there goes your A. Coming from a non-CS background I found the HWs very challenging. The project itself was not hard for me; I had already taken Regression and have on-the-job experience in Tableau. I did however, have to hustle, and take a couple of days off work to make sure the group project got a 100% and I passed the class. If this was not a required advanced core class, I would not recommend it.
Very interesting material. I love visualizations and data manipulation. This class was just too much material to get a solid understanding of anything. There are so many technologies you use for each project that you never fully have time to master or understand them. Just enough to try to pass the class.
I took this as my first course in spring 2018 and enjoyed it very much. I had opportunity to learn R studio.
This course is horribly designed, especially, for OMSA students. They expect you to learn a dozen different software tools on the fly throughout the course. The OMSA background does NOT require students to come in with anywhere near enough programming experience or knowledge to be successful in this course. It needs a complete redesign with less breadth, more depth, and more guidance materials on each of the software tools.
Unfortunately, I had to drop the course and receive a W because I was on pace to finish with a D in the course while spending 15+ hours per week on assignments. I don’t know how many complaints GT needs before they make some changes to the design and expectations for this course.
It touches on various data science/ML topics but focuses on visualization. So it’s a very good course to take if you are heading toward ML.
The project is open ended and hard to coordinate and get going. The homeworks were hard for me, especially the first few that use more D3.
The videos helped me understand some homework-related concepts and finish the homework, so I can say they are useful and more important than mere theory.
Fun course, it’s not hard if you have good teammates
This is definitely a class that you can take during your first semester as I had. This course is a survey course, it does not go into each topic or concept in depth since it is meant to lightly touch each topic. I found the projects fun and challenging. The project was quite interesting to work in since it was the first time I’ve done a group project completely online. You’ll definitely need to be on top of your homework and project to keep up.
This is by far one of my favorite classes, as well as one that required a fair amount of work and effort. There are 4 main homework assignments that make up half your grade. If you are super familiar with each of the technologies used in an assignment, you’ll find that particular assignment easier than others. For example, the last assignment was based on Python and I’m fairly good at Python, so it was less work than figuring out numerous homework problems in D3 (a JavaScript library). Each one definitely takes a fair amount of time to complete. The final project is a decent amount of work; however, it seems to be graded pretty fairly and in alignment with the rubric (which can be vague to some degree). As long as you pick an interesting final / team project, find sufficient data, perform a complete operation to compute a value and present that information in a nice graphical way, you’ll be fine. The rest of the final / team project is just paperwork around that. Time intensive but not complicated. The homeworks and projects make up your grade with the four homeworks totaling 50% and the project 50%. There aren’t any tests, but there are chances for a few bonus points by taking quizzes. There are 4 such quizzes and the best three form your bonus points. The course covers a wide variety of data analysis topics, from the best way to present data to how to process large amounts of data locally and using (outdated) tools such as Hadoop and (more modern) tools like EC2 and compute engines on AWS and Azure. There are even homework problems about scraping data from the web using APIs. You’ll also learn some Python along the way. There is no R in this class for the assignments, though you certainly could use it if you wanted for your final project. Don’t watch the videos; they don’t add any value. There is room for this course to grow. The videos, for starters.
Some people will complain about there being too much work or that it tries to do too much, but that’s the point. There is a lot to the world of data science and data analytics. There are many tools to use and many factors to consider. There isn’t a single data scientist out there who just crunches numbers in a Python Jupyter notebook all day, claps their hands and is done. Stick with the course. Lots of work but you can make it through.
Didn’t learn much.
If you do not have experience with JavaScript - I would recommend you do not take this course because HW2 is all JavaScript and D3 and can be an incredible pain.
If you have a decent background in Full-Stack Development, this course will be relatively easy for you.
The Homeworks do not change and with a little research, you can find all the HWs on github or GaTech’s enterprise github. 90% of the TAs have zero clue just like you, so they don’t ever answer your questions. The professor is slightly narcissistic. HWs can have minor changes the day before they are due.
I spent more time setting up my environments or getting configurations correct than doing the homeworks - LET THAT SINK IN.
As for the project, we did it at the last minute… I mean at the last minute and we got like a 98… so don’t break yourself trying to think you need to solve the world’s problems.
All in all, this is a survey class that is partially not up-to-date on some technologies and can be insanely difficult if you do not have the right background.
The course can get really exhausting. There are four assignments, and most of the time will be spent on trying to set up properly, instead of actually doing the homework. The second focuses on D3 which, in my opinion, is a really bad visualization tool. It is a technology that will soon be outdated, and it doesn’t follow the same programming logic as a normal tool would. It would have been better if time was spent on learning the best practices for data visualization.
The material covers lots of buzzwords, “learning” lots of tools, but you end up having only a glimpse of most of them.
Half of the grade is a group project. It is not that hard, if you are lucky to find at least two partners that are willing to work. However, instructions are not clear and the rubric is ambiguous. Feedback was always late, and it wasn’t clear.
I would not recommend this course to anyone. Of the 5 courses I’ve taken in the OMSCS, this had the highest workload, yet somehow I learned the least. Save yourself the time and do this d3.js tutorial, https://d3js.org/#introduction, and watch these videos on making nice charts: https://www.youtube.com/watch?v=xGsVd_SJ2YA
Save yourself the frustration of way-too-large homework requirement documents where the TAs and professor at first simply refuse to clarify any further. When they do, they can be smug or just take so long to clarify that you’ll possibly jeopardize your assignment (since some take hours or days to run).
Not only is the workload so severely miscalibrated, but the course is staggeringly off-topic. You won’t learn about data analytics as you might professionally understand them. The course should have simply been called Data Visualization, or “How to use d3.js and some other misc CS stuff”. We spent 2 weeks building a Decision Tree… for a Data Analytics course. This would make sense… if I were taking a Machine Learning or AI course. Why did the professor have us spend 2 weeks writing a decision tree in python for a data visualization and analytics class? The problem is the class doesn’t know what it’s trying to be. Unclear why the professor has you work your butt off, presumably so he can claim he runs a tough course?
Note, this review comes from someone who got an A in the course, has a 4.0 GPA, and has 7+ years of industry experience. My frustration stems from having wasted so much time. SO, so, so much time–for months. I’m in this program to genuinely learn and this course had me run a gauntlet while learning nothing at all. Do not take it. The course makes the OMS program weaker, not stronger. I spent such a large amount of time each week in this course on such trivial meaningless tasks, that I was severely burned out by the end. And, after all is said and done, a group project worth 50% of your grade? You’ll be submitting two pieces of work in the final week that are worth 25% and 7.5% of your course grade. That’s a whopping 32.5%, and you won’t be given a chance to request a regrade. Good luck hoping you get a hard-working group and diligent TAs.
After working myself to exhaustion, the best things I learned were: how to use d3.js (a javascript library) and how to select colors well when making a bar chart. I took the course so I could fill a data analytics niche at my company and walked away knowing as much as I entered with. If you take this course, say so long to 4 months you could have spent learning useful skills.
If you like to learn, like I do, I highly recommend Machine Learning 4 Trading (medium difficulty), or High Performance Computer Architecture (high difficulty). Those were outstanding courses and show how the OMSCS can really shine.
Background
Mechanical engineering major undergraduate 2 years out so no formal CS education. Transitioned from Software Test Engineer to more Software Development during this phase of taking this course, full time job. Knowing a bit of Python helped a little on some of the homework. Courses taken so far were:
- GIOS Spring2018
- SAD Summer2018
- SAT & SDP Fall2018
Classes taken concurrently with this class: CN & DB
I would say by now I have gotten the hang of the program and how it works, so I knew my abilities and what I needed to do.
Summary
Great class with a lot of breadth in learning big data technology. Lots of D3 JavaScript. Homework assignments take a good amount of time, especially HW2 that was focused on D3, but they are a lot of fun. Make sure to find a good team for group project because it can be really great where you learn a lot from others and build an awesome project, or really shitty where no one knows how to do anything and you gotta carry your team. Even if you have to carry the team, you can make a pretty simple/easy project and still get a very good grade, I think, as long as you submit something and the paper is decently well written and follows all requirements.
Pros
- NO TESTS
- Interesting and important subjects
- Homeworks are very engaging and fun, well guided
- Lots of breadth so you have a pretty good idea of what something is or how it works when brought up in conversation/interview
- Easy A if you are diligent to do all the homework and your group does not fail you
Cons
- Can take up a lot of time
- Possibility of needing to carry your team in group project since the class is combined with OMSA, so suggestion is to find a good team :)
Tips
- Start homework early. Keep an eye out on piazza because people post the same questions and answers which you will most likely also have/need.
- Try not to get a group with OMSA students :)
- Not recommended to take 3 classes with full time job :D
It’s hard to pin down what this course covers, it dabbles in many different forms of data visualization and processing without going super in depth into any one area. Difficulty is very front loaded, with HW1 and especially HW2 taking huge amounts of time, while 3 didn’t take long and I was able to complete 4 in a single dedicated afternoon. I put my average workload (including project) as 15, but really it was more like 20 for the first half and 10 for the second. I tend to finish assignments quicker than average though, so I would prepare to require as much as 25-30 hours of work on some weeks depending on your knack for general programming and quickly picking up new libraries and software with little direction. Lectures rarely provide more info than an appetizer for what you’ll need to teach yourself, so I generally did not keep up with them.
The real meat of the class is the group project. If you have a good team with a complementary skill set it can be alright, but I could imagine things getting tough with a suboptimal group. Make sure your team has at least one skilled writer. Larger groups can actually make things harder on you since word limits are strict and each member needs to contribute 3 unique pieces of related literature for the survey portion.
Grades are very generous. Assignment grades average high B’s and project related assignments had a large majority of A’s. There are also multiple opportunities for extra credit and the instructor lowered letter grade cutoffs by 2%. If you can handle the time commitment it shouldn’t be too hard to land a decent grade.
Overall this was an effective, though imperfect course. The material covers a wide breadth of topics, which is great for anyone who wants to get a lot of exposure to different tools, but nothing is covered terribly in-depth. The lectures are much more like springboards into self-study than they are comprehensive reviews of the topics. This leads to a lot of time needed to grasp the concepts and implement the homework assignments, which are sizeable, but ultimately not too demanding. Others in the course had issues with communication but I did not have any trouble with it.
There’s also a large course project to complete in teams of 4-6 people, mine went great and I was happy with the team and our results. If you get a lousy group then you might have a very different experience, so make sure to be proactive early in the course and secure a spot with a dependable team.
Although the homework assignments and course project were very time-consuming, there are no exams, and grading was extremely generous. I ended up with a 103% overall grade thanks to the generous scoring and extra credit. As long as you’re putting in the time and following the directions, there’s no reason you couldn’t achieve a similar result.
Previous courses: ML, AI4R
Background: I am a working Data Scientist.
Do not take this course. I have rated it as very hard not because the content is hard, but because it is stacked with busy work. The course is disorganized. The projects are too long and often offer little in the way of instruction. TAs are untrained and unhelpful. Professor is aloof and may not completely know what he is talking about.
Project 3 in this class was one of my worst educational experiences ever. For this project, you configure four different big data systems only to run ten lines of code on each. The bulk of the time is spent configuring the systems. There is no understanding of big data, why it is important, nor what can be done to work with it. It is an exercise in tedium.
Do. Not. Take. This. Course.
If you want to learn ML, take the Machine Learning course, which is well organized and has a wealth of good knowledge to bestow.
This course is garbage.
Do. Not. Take. This. Course.
I really enjoyed the D3 assignment (the second homework). The rest of the homeworks were okay but in the end it is what the prof warns you about in the class description: really a lot of variety, unrelated tech stacks, languages and techniques, you will be exposed to great breadth but no depth. The group project worked out fine for my group but overall I do not enjoy group projects and do not believe in their educational value. The course is more on the laborious side, keep that in mind when taking with something else time consuming.
Find the review titled “pick a lane, DVA” from last semester. It is correct. There are four assignments. Only one assignment delves into any depth on something to the point that you feel you could actually build something halfway interesting on your own (D3.js in the second assignment). The other assignments are more or less “go do a basic tutorial on [some hot tech that shows up in job requirements], but change one of the parameters so that it feels like you aren’t just doing a tutorial.”
Many of the assignment questions give instructions at the detail of “click this button to set this parameter. Then, scroll down and click Run…” At the end, sure, you’ve run a map-reduce or whatever, but you don’t actually know how or why you did it besides to finish the assignment. I don’t really feel like I could say anything meaningful about Spark, for example, other than it lets you work with big data.
The instructor mentions the philosophy of the assignments is to teach you to quickly pick up different technologies, but the problem is you don’t actually pick up any of them. Running through a tutorial that tells you every single parameter to set isn’t picking up a technology. It’s just running through the motions.
All this said, the D3 stuff was interesting. I made a couple of cool graphs. If I do a personal project down the line, I might use D3 for something because it’s really nice to be able to show something visual.
Does this mean you shouldn’t take the class? I don’t know. It just is frustrating in some respects. I wouldn’t take it if you are not a ML-spec though. Even if you are, I’d think twice about it.
This class just tries to do too much. This class covers so many subjects that nothing is covered in a detailed way. I came into this class with some knowledge of the subject areas (AWS, Map Reduce, Sklearn) and didn’t find the homework to be difficult, just time consuming. Be prepared to have to sink a ton of time into homework assignments and, after you finish, question how much you really learned for all that work.
The group project is another huge time sink. As a group project, be prepared for all the typical frustrations there.
I ended up learning a fair bit in this class but didn’t feel like I was taught much. The class doesn’t really do a ton of teaching; it just points you to other references to learn what you need to learn, which made me question at several points why I am paying for a class if I am just going to teach myself.
An interesting course that tries to do just a little too much. Most of the homework wasn’t hard but was fairly time-consuming (homework 2 especially!), even for experienced developers. Lectures take up very little time and provide a brief overview of a topic. Assignments and the class project are graded leniently, so you should get an A if you do the work.
Overall, a long work course that is just “get through it”. There’s a lot of potential in this class that was just not realized in the Spring 2019 semester.
Here’s the breakdown:
The GOOD: There is significant exposure to a variety of technologies across the board, from machine learning/cloud cluster computing in Azure and AWS/etc. (the aforementioned “Data Analytics” part of the course) to data visualization with d3/Tableau/etc. (the aforementioned “Data Visualization” part of the course). After doing all of the homeworks, you can probably understand what it’s doing under the hood and not sound like an idiot when questioned about why it’s on your resume under the skills section.
The BAD: This was a beast of a course that simply tries to do too much - the course title is “Data Visualization and Analytics”, but has elements of “Data Visualization” and “Data Analytics” and four massive homeworks trying to accomplish both. The end result is a Frankenstein’s Monster of homework problems each taking 5-10 hours introducing completely different concepts.
The course does so much that you get one homework problem worth of understanding, but then immediately pivot over to something new and you’ll never use the previous question’s concept EVER again.
The lectures are good for nothing except for about 3% of your grade, and they’re extra credit. They’re short and interesting enough, but google is more helpful than the lectures.
The UGLY: 4 homeworks AND a group project. The total time commitment would have been roughly 200-250 hours over 4 months, depending on the time spent for homework and project. Obviously if you are an ace coder, the amount drops significantly. However, with just CSE-6040 and some basic coding chops, the estimate is fairly accurate.
Too much busy work, not enough learning
There are 4 homeworks that together account for 50% of your grade, then a group project that accounts for the other 50%.
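For anyone wanting to see how that split plays out, here is a minimal Python sketch of the weighting. It assumes the four homeworks are weighted equally within their 50% and ignores any extra credit; the function name and the sample scores are hypothetical, not taken from the course.

```python
# Weights as described above: homeworks 50% in total, group project 50%.
HOMEWORK_WEIGHT = 0.50
PROJECT_WEIGHT = 0.50


def course_grade(homework_scores, project_score):
    """Combine percentage scores, assuming equally weighted homeworks."""
    homework_average = sum(homework_scores) / len(homework_scores)
    return HOMEWORK_WEIGHT * homework_average + PROJECT_WEIGHT * project_score


# Hypothetical scores, for illustration only.
grade = course_grade([90, 80, 100, 90], project_score=96)
print(f"Overall grade: {grade:.1f}%")
```

With these made-up numbers the homework average is 90, so a strong project score pulls the overall grade up by half the difference.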
I did not enjoy the homeworks. They each consist of around 15+ pages of config/setup instructions, and multiple multi-part questions, each of which links to more pages of instructions on setup. There is nothing intellectually hard (some of the homeworks are literally to just do the tutorials on AWS or Azure), but getting the configuration set up and copying/pasting the correct commands in the right order is frustrating, and then working through the vast and confusing/conflicting grading rubrics takes a long time. The homeworks are very superficial, and cover a lot of buzzwords. I honestly can’t even remember the names of all the technologies we were meant to have learnt, but I now know enough to say that Pig, Hive, Hadoop (and maybe something called Spark?) are something to do with big data. There you go - you’ve just learnt as much as I did over the course of the semester, that’s a freebie.
The group project is not fun. The guidance is minimal (“go and create something that uses a lot of data and has an interactive visualization, no we can’t give you any examples”), and obviously it’s a group project. It seems that the professor is attempting to squeeze the on-campus format of the course into the online version, but a group project just doesn’t fit the online offering very well. Your experience will depend highly on playing chicken to see which member of your team actually does the project - if it’s you, then be prepared to put in a lot of work. If it’s someone else then congratulations, you get 50% of your grade for free.
This course could be great if the group project were removed, the homeworks restructured to be more focused, the curriculum updated (removing d3 and maybe Hadoop), and the lectures lengthened and improved.
From the viewpoint of a college econ major, this is a pretty tough course. Lots of different content with an attitude of learn by doing. With more time, more direction would have been nice but I can’t deny that I learned a lot in this course. I think that Polo is a good high level lecturer and that the homework assignments are well designed and offer clear direction.
I don’t really agree with having the final project count for half of your class grade since it might only incorporate a few elements of what we learn from homework/lectures. I will warn that managing the project on top of homework can be stressful. Plan ahead.
Homework on average took me somewhere between 25 and 35 hours each. We have about three weeks for each assignment, so time dedicated to homework is roughly 10 hours a week with a good degree of fluctuation.
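As a rough check of that weekly figure, here is a small Python sketch. It assumes exactly three weeks per assignment and an even spread of effort, which the "good degree of fluctuation" above says is not quite true in practice; the function name is only illustrative.

```python
def weekly_hours(total_hours, weeks=3):
    """Spread one homework's total hours evenly over its window."""
    return total_hours / weeks


low = weekly_hours(25)
high = weekly_hours(35)
print(f"{low:.1f} to {high:.1f} hours per week")
```

The midpoint of that range is the "roughly 10 hours a week" quoted in the review.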
This was (one of the) first semesters of the new DVA. Also my first class in OMSCS. The class was considerably more difficult than expected. Some assignments took only a few hours, one (assignment 3) was on D3 and javascript, which I did not know at all. It took me something like 46 hours to complete, but was thankfully, by far the worst. It’s a broad class but not very deep, I learned a lot, and loved the class.
I loved this course, it was a nice introduction into R and data analytics.
I picked DVA because I had some familiarity with big data systems from my day job, and thought it would be interesting to get a better theoretic grounding and some more hands-on experience. Boy, was that a mistake.
There are four homeworks, each counting 12.5% towards the final grade, and one group project, which counts for 50%. The homeworks are the most tedious assignments I have ever had the displeasure of having to do. The descriptions are generally 8 or 9 pages long, with 4 or 5 “questions”. For each question you need to start completely from scratch, and it often involves spending hours setting up an environment in order to tweak two or three lines of code. It’s a bit like they cobbled together 30 different online tutorials, and make you recreate all of them. There is 0 creativity in the homeworks; you just have to mindlessly follow instructions for hours on end. Out of the 20 hours I spent on each project, around 10 minutes was spent actually doing something interesting (like writing a map reduce algorithm). The rest was setting up environments or following mundane instructions to the letter, so as to not lose points.
The lectures also don’t save this course: they have very little substance, cannot be downloaded for offline viewing, and help next to nothing with 99% of the tasks and projects. You may as well not watch them.
As for the group project, well, that depends on the group you end up with. I enjoyed that part of the course. At least it required some creativity!
This class is garbage. Don’t waste your time
This course was reinvented this semester and I liked it. It still has a lot to improve, but it’s not as bad as I was expecting (due to the previous bad reviews)
The course introduces you to a lot of tools, techniques and buzzwords from the big data and data visualization fields. It is a broad survey over many distinct tools. You get to know many things in a very short amount of time.
There are 4 very practical assignments that make you experiment with many tools, and a big team project in which you need to create a complete “big data and visualization pipeline”
I recommend getting some ML course before (like ML4T or ML), so you can focus on the tools not on ML concepts. Also, knowing javascript beforehand can be of a great help.
Overall DVA was a decent course and I learned enough. It was very set in the R way of doing things and I used mainly the tidyverse to get everything done with no issues.
It’s my second semester in the OMSA program. The course was very difficult because I had to learn so many things that I had little experience with. I was taking another class at the same time but spent most of my time on this one. If you don’t have a computer science background or are not familiar with many of computer languages and environments, it’s going to be a very time-consuming course.
It had only four homework assignments and a group project, no exams. It did provide 4 short (10 minutes) and easy bonus quizzes (top 3 used) to improve the final grade (worth 1% point each). So at least we didn’t spend time preparing for exams, which often don’t help us learn or assess our learning.
The professor was very engaging and deeply cared about the course and students. He was very responsive and communicated personally to students. The TAs were generally good, but some were not responsive. There were 500 students in the class. They did a good job organizing questions and responding to questions on Piazza. It was a continuous improvement process – things were getting refined during the course. Obviously, the professor and TAs put a lot of effort into the course, which I appreciate.
The homework instructions were ~ 15 pages long and included links to additional instructions or online resources. So much of the learning was outside the edX video, in the reading materials in the homework. Much time was spent understanding the homework requirements and setting up the accounts, environments etc. There were many assumptions as to what you already knew. So if you were really new to the material or concepts, you could be lost and would have to look up information elsewhere, which was very time-consuming and frustrating, especially when the instructions were not well written and the TAs were not responsive.
The problems were not hard. But setting up the environment and learning the basics of a new language or tool took much of the time. So we got a taste of many things but would need more practice to feel confident about our skills. If they split the course into two or more topics, we could go deeper. But I understand the compromise.
The project was not too hard, at least for my team who were very engaged and skilled. There was a lot of work, not enough time. It took a while for us to decide on the topic and scope. There were too many specific requirements for the project, similar to writing a research project proposal and paper. While I see the value of the requirements, it’s not realistic to truly benefit from them in a short period of time. The grading of the project was generous, especially for the final report. I would recommend reducing the weight of the project from 50% of the course grade to maybe 30% with a correspondingly smaller scope and fewer requirements.
Overall, despite the issues, I learned a lot and liked the course. However, if the homework instructions were better written, I could have learned the same in far less time.
As a non-CS undergrad major taking this as a first course the workload was much more than I was expecting. The first HW wasn’t too bad but as the course got further and further into D3 using JavaScript, I couldn’t keep up. True, I could have survived with some better time management skills, but as a first course I definitely did not schedule myself properly. Lectures are a series of short videos and it is mostly teach yourself. There is a semester-long group project with many checkpoint due dates. The TA support is very good on piazza. I had to withdraw from the course halfway through.
UPDATE: after taking several other OMSCS courses I have decided I strongly dislike this survey-style, busy-work heavy type course.
Took Fall 2018 with the new format, and it was a lot of work. The 4 assignments are very time consuming. Fortunately, there are no exams (aside from some bonus quizzes, which count for a bit of extra credit), but there was a very large group project that took up 50% of the grade. Because the class was combined with the OMSA program, there were a lot of students and the TA’s felt very busy and would often be difficult to get a hold of to ask questions. This class was “very hard” as my first class to the OMSCS program, since it required a lot of self-study (having had no experience with D3.js or Hadoop before). But if you have Python and Java experience, then the class is probably more manageable.
New version of DVA for OMSCS. I thought it was going to be great even with the group project given the technologies that we would use. But in the end you spend a ton of time setting up environments to do the projects. It took time to learn some of the new technologies to complete the assignments once you did have it set up, but that was not bad.
Essentially the class is a survey of big data technologies with a slight extra focus with how the data is presented. There are some good tips for the latter, but still most of the time is spent setting up environments and not learning anything really useful.
They also want you to form a group, come up with an idea, and decide which technologies you will use within a very short timeframe. You can change the technologies you use, as research goes, but to have this requirement when some coming into the class may not know any big data technologies beforehand is odd to me.
Please note that this is a review for the “new” DVA course which is shared between OMSA and OMSCS starting Fall 2018.
If you are a first semester student, read this review thoroughly to understand what you are getting into - in this course, if you do not stay on the ball, you would be thrown for a spin.
Let’s begin with the facts of the course. This should answer some commonly asked questions.
- True to its name, the course covers visualization of data and big data analytics
- On the conceptual end, it covers aspects of human-computer interaction (HCI) relevant to visualizing data, Data Cleaning, Supervised Learning, Unsupervised Learning, Graph Analytics and Text Analytics. The material is not very mathematical. So, brushing up on math is not a prerequisite
- On the technical end, you would use D3.js, Hadoop (HDFS, Pig and Hive), Spark, Python (not R) and several other tools
- Professor Chau was very involved and easily accessible via Piazza throughout the class
- The TA’ing was very good - if you ignore a hiccup here and there, it was top notch
- 50% of the grade is homeworks while the other 50% is the group project. There are quizzes for 3% extra credit. There are no exams
- You get 3 weeks to finish and submit each homework. There are 3 weeks for the project final report at the end but the project runs in parallel with everything with deliverables spread throughout the semester
- There are several office hours spread across the week and are conducted on Slack
- Project teams are 4-6 people (larger teams with permission) and self-selected. Aim for the higher number since many teams end up with 2-3 people due to withdrawals and that can result in a lot of work. Your team can have people from both OMSCS and OMSA.
Now for my opinion of the course. The course is very well organized and a must-take for anyone doing the ML specialization. The lectures are informative and the assignments are designed to give you practical experience in what you are learning, though assignments do not cover everything learned. Note that lectures are only available to OMSCS students on Canvas which means you cannot access them before the start of semester.
The difficulty of this course comes, not from complexity but, from the amount of work you are required to do. The key to succeed is to start every assignment early - every homework has a lot of setup steps and each question can require you to learn something new. For example, for a single homework, you are expected to learn Hadoop HDFS, Pig, Hive, Spark Dataframes, Azure and AWS etc. within a 3 week period - in addition to the lectures and project work. If you cannot ramp up your knowledge fast enough, this course is not for you.
D3.js merits its own section. IMHO it is poorly documented and that adds to the amount of work you need to put in. The Scott Murray tutorial is a nice beginning but takes you nowhere near the complexity you face in the assignments. If you do not have prior experience working with HTML and Javascript, it adds to the misery. Finally, I would question the high weight attached to D3 in the course when most work environments would prefer pre-packaged visualization solutions.
In conclusion, this course is definitely worth doing but to do well in the course you need to prep yourself and maintain a high tempo of work.
This is the review for fall 2018 DVA. This is my first course and I had only python skills. After finishing the course I have skills developed in python, SQL, JavaScript - D3, basic skills in AWS, Azure, Mapreduce, ML classifier - Random forest and Linux. I didn’t gain enough in depth knowledge but the course teaches you the state of the art techniques and tools in Data analytics and visualisation.
If you are a bit reluctant about learning all the above-mentioned tools and programming languages on the fly, then I advise you to stay away from this course or develop your skills by taking some courses beforehand.
Apart from 4 assignments which had to be submitted every 3 weeks, there is one group project which contributes about 50% of the credit. So it’s going to be very hard if there is no equal contribution from team members.
There is no test.
I got an A grade with 92%. So this course is pretty much doable if you have the enthusiasm to learn new skills.
The homework assignments were somewhat rewarding and I definitely gained a few skills, but I found the project frustrating. It is super open ended and it was really difficult defining our goal at the beginning of the semester with a team of 6 strangers. Your time commitment is pretty much determined by how ambitious your team is. This was my first OMSCS semester, so I don’t really know how this compares, but I do think the grading on the group project was generous. (My team turned in some work that I was not proud of and the lowest score we received on any component was a 99%)
It is a very time intensive class that throws a lot of different tools and packages at you. It is great for its breadth but lacks on depth into each. It has 4 homeworks and 1 group project. Homeworks are 3 weeks long and take 20-40 hours each. While it required a lot of effort, I found the grading to be rather lenient.
Fall 2018.
Not enough motivation for the gamut of topics.
The assignments were very broad but it was really a checkbox-type class where it felt like we were covering topics “just because”.
The hardest part was parsing the nonsensical requirements on the projects. The second hardest was context-switching between technologies.
I think it was a useful class, but I don’t see how it fits into a graduate curriculum. It’s a good skill-builder, though, and definitely challenging and doable. I say “dislike” but that’s because it seemed pointless. The work itself was reasonable. It was a lot like a job, actually.
Overall, a great course! Glad they changed to the “new” format…
My first course in OMSCS.
This course was challenging in that it required learning a plethora of new languages, frameworks, and tools in a short amount of time. Some students may see this as negative; I chose to see this as positive, equipping me for the “real world” where I will get thrust into situations where I don’t have a TA to hand me an answer…
This course was also challenging because of the group project. We started with 5 members; 3 dropped the class and we completed the project with 2! But we survived. Towards the end of semester, it seems the TA’s got lazy and gave almost everyone 100% on the final report (worth 25% of the grade!!!), as the mean score was 97.7%….choose your group members wisely and do not select an English undergrad to be in your group (lesson learned…)
I was frustrated a bit with the amount of hand-holding in this class…students complaining about how “hard” it is to learn all these new languages. Well…this is a Master’s program!! The reason this frustrates me is that it totally devalues my Master’s degree if anyone can come in, have their hand held through the entire degree, and be awarded at the end. Ok, rant over.
Most of the HW was easy, just time-consuming with a lot to do, lots of i’s to dot and t’s to cross…HW’s are moving targets with requirements changing daily based on student questions or debates over ambiguity. Nevertheless, there is a LOT of boilerplate, template code where all you have to do is “fill in the blanks”…ah memories of undergrad freshman programming classes… ;)
Toggl Time Report: https://goo.gl/L3DC9Y
Each HW is granted 3 weeks to complete… plenty of time if you START RIGHT AWAY. Seriously. Start right away. Just not the day it is released… start a couple of days after it’s released so that rather than wasting your time with kinks that weren’t figured out, you can come to Piazza and observe all your most pressing questions asked and hopefully answered.
Time consumed and topics covered/tools & languages used:
HW1: https://goo.gl/vCHHCG 29 hours
- Python/APIs
- Gephi/graph viz
- SQLite
- Javascript & D3 Intro
- OpenRefine
HW2: https://goo.gl/7oNpHs 35 hours (longest). If you don’t know Javascript before coming into this class, you’re gonna have a HARD time on this HW. I am quite familiar w/ Javascript (I would rate myself 4/5 on expertise) and you can see how long it took me.
- Tableau
- D3, D3, and more D3! (loved learning D3…very powerful and useful library)
HW3: https://goo.gl/uaAxb3 24 hours (shortest). The TA’s did an excellent job providing lots of guides on how to get up and running with AWS, Azure, etc. All I can say is USE A MAC! It will make your life MUCH EASIER!!! ssh…scp…etc. Also, spend time and dig in…LEARN how to use these tools, don’t just complain it is being dumped on you
- Intro to Hadoop (very simple algorithm). 1 Question run on a pre-config’d VM, 1 Question run on Azure
- Intro to Spark (again, very elementary algorithm). Run on Databricks cloud-based env…almost 0 setup
- Intro to Pig…very easy to pick up. Google and the class slides are your friend.
- Azure ML studio…almost not worth mentioning. Beyond easy. Follow a tutorial…
HW4: https://goo.gl/hjVwdu 25.5 hours (one of the most challenging). I had 0 ML experience coming into this class, which would explain why this HW was the most difficult for me. If you’re familiar w/ sklearn & numpy it will be easier. Still…questions were quite lenient in retrospect, lots of pre-baked template code where you fill in the blanks.
- warmup exercise w/ mmap (fill in the blanks to pack/unpack binary files)
- build a Decision Tree Random Forest in Python from scratch (this was the challenging part, without any guidance or algorithm…all I can say is Google ;) )
- sklearn to test different ML algs and report accuracies
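For readers wondering what the mmap warmup looks like in practice, here is a minimal sketch of packing fixed-size records into a binary file and reading them back through a memory map. It is not the course's template code: the record layout (two little-endian 32-bit integers) and the names `RECORD`, `write_edges`, and `read_edges` are assumptions made up for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: two little-endian 32-bit signed ints (e.g. one graph edge).
RECORD = struct.Struct("<ii")


def write_edges(path, edges):
    """Pack each (src, dst) pair into 8 bytes and append it to the file."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(RECORD.pack(src, dst))


def read_edges(path):
    """Memory-map the file and unpack one record per RECORD.size bytes.

    Note: mmap cannot map an empty file, so this assumes at least one record.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [
                RECORD.unpack_from(mm, offset)
                for offset in range(0, len(mm), RECORD.size)
            ]


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "edges.bin")
    write_edges(demo_path, [(1, 2), (2, 3)])
    print(read_edges(demo_path))
```

The memory map lets the operating system page the file in on demand, which is presumably why the exercise uses it for data that may not fit in RAM.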
No exams, thank goodness!
This was my first course in OMSCS and also the first semester they switched to the OMSA version. I have a full time job, family, etc. My background is NOT in development which is one of the reasons I chose this course as I wanted to get my feet wet with data and its manipulation. I found the course very difficult but really liked it. I’m expecting to safely pull away with an A.
The professor and TAs were great. They hold regular office hours. Grading is fair (but follow the rubric to the letter!)
You touch on so many really cool technologies, especially those that look good on a resume - Hadoop, Spark, Azure, AWS, PIG, Tableau, even a touch of machine learning at the end. But as others have mentioned you don’t get to know any of them really well. Dr. Polo obviously knows what he’s talking about and you can tell he’s passionate about this stuff. Unfortunately, the lectures are really high level and are little help when you’re actually doing the homework. They were helpful though for the bonus quizzes which seemed to come straight from the lectures.
There are no tests but the group project is a lot of work with a lot of deliverables. I was lucky and had a great group which helped a lot.
If I had it to do over again I would:
I took this class in Fall 2018, and since there was no option to select Fall 2018, I selected the most recent semester available. The class syllabus changed as of Fall 2018, and it’s not the same old DVA that used R programming. There are quite a lot of skills to learn in this class (D3, Azure, AWS, Hadoop, Pig, Spark, etc.) across 4 assignments. The homework is pretty good and gives good insight into using these modern tools and technologies. There is a group project worth 50% of the grade. I was hesitant to take this class because of this group project. We did a decent visualization project, though the collaboration was not great and we had the typical team issues. However, the scores for the group project were pretty lenient, and I am expecting an A; I am sure the majority of the class got excellent scores on their group projects. I am sure Prof. Polo and the TAs want everyone to succeed in this class, and overall it was a good learning experience for me.
This course tests your ability to setup an environment and rapidly acquire enough knowledge/understanding to make code changes in a rather large set of different languages within the context of data/visualization. These changes can range from trivial (adding a line of code) to non-trivial (implement an algorithm and necessary data structures within a predefined framework).
As a result, the course is heavily weighted towards breadth over depth. You probably get the most depth out of D3/javascript with almost an entire assignment dedicated to it, but for most other languages covered, you are barely scratching the surface, which is rather unfortunate.
Most of the assignments are not conceptually all that difficult, but working through the environment setup and figuring out the logic of the frameworks can be quite tedious. The group project of course is not suited to the online experience. You have to hope you luck out with a decent group that will share the work. You’ll basically want to get enough technical expertise within your group to cover the scope of building a data driven web app.
The teaching staff and TA’s are responsive and engaged, which is great.
Ultimately, the course is alright, but I didn’t really care for the survey nature of it, especially with so much time dedicated to environment setup. I think that it would benefit from focusing exclusively on SQL, Spark, and D3.
TLDR: This course is very effective in meeting its intended objectives, just make sure the objective is what you are seeking.
Some people dislike this course because it does not make you competent in any of the skills taught (except possibly D3.js). However, the stated purpose of the course is to give a breadth of exposure to many facets of a data science project, and I think it does that well.
Most of the homework assignments could be best described as completing “hello world” in whatever tool is being presented (e.g. D3.js, SQLite, Hadoop/Mapreduce, Spark, scikit-learn, virtual memory, etc). The point is to make you aware of what all these tools are, and when you should consider using them. I appreciated that framework as I worked the group project, as I knew roughly what I needed to use and taught myself more depth as needed.
I spent a lot more time than I wanted to on the group project, but it did force me to learn a lot and start grappling with open-ended and messy data science work outside the clean environment of toy problems typically encountered in an academic course. It’s motivated me to carve out more time next semester to work Kaggle/open data problems because I learned so many things that I could not be taught in a lecture through diving into the project.
I’m a bit torn about this course; on the one hand, I feel like I learned a decent amount through the assignments, and the professor and TAs were responsive on Piazza, which was greatly appreciated. On the other hand, the group project was a massive pain in the ass, and I don’t really understand the rationale for doing a group project in an online distributed graduate program. In particular it was quite challenging when certain group members contributed variable amounts to the project, and I ended up feeling like our project frankly would have been better done by one or two people - at higher quality, likely less total time, and certainly less stress than was achieved by doing it in a group.
At times I felt like Polo, the professor this term, was being super arbitrary in his interpretation of various rules and policies. Additionally, one ding on the homework assignments is that they required lots of environment configuration time, particularly relative to the actual core concepts. On one homework, I probably spent several hours configuring the cloud environment to run my MapReduce code, which I had written in something like < 5 minutes. Very very tedious and not that fun.
Overall I wouldn’t advise people to stay away from this course, but would certainly have them going in with eyes open. The homework assignments can take a long time if you don’t go in with a great handle on JavaScript, and the group project is no fun. But it’s an okay course overall, and I’m sure there are far worse in the program (at least the teaching team is engaged and seems to care quite a lot).
Note: I’m an OMSCS student and this is the first (read: guinea pig) semester where the OMSA DVA is being used.
This is one of those courses where your experience will vary greatly depending on your past experience (both with coding and with OMSCS/OMSA coursework). Some people who are relatively “green” will suffer long hours of struggling. Many people lost points because they didn’t read the assignment text carefully enough.
I went in knowing really only python and a little bit of SQL, and I did well in the class. The projects don’t really go too in depth and provide usually only a cursory look at each technology/tool - though that’s usually enough to get you going on more complicated pursuits. Note that you’ll need to set up an AWS & Azure account using a credit card.
Be aware that the group project is 50% of the grade and intermediate deliverables are due throughout the semester. If you are bad with time management, and/or dislike group projects - you should stay far away from this course. We had a passable group of partners (they recommend 4-6), but some groups had complete meltdowns where the majority of the group dropped out of the course or just decided not to participate anymore - which is standard stuff for any OMSCS group project class. Also, if you dislike web development work, the largest project in this class is a d3.js project that takes a relatively large chunk of time.
Aside from these caveats, I greatly enjoyed the class and got quite a lot out of it. The TAs and prof interactions were above average for the typical OMSCS course.
This is a beast of a course.
I learned a ton of javascript and d3. I was introduced to a wide variety of other things, such as Gephi, Hive, Pig, AWS, Azure, Python for machine learning, and others.
Each homework was epic, but we were provided the time (several weeks) to make it work.
No exams.
The group project was challenging, but doable with a team willing to “divide and conquer”.
I would not recommend taking this course with any other course if you have anything else going on in your life.
Overall, this is a survey course from a lecture perspective, that asks you to dive deep in the individual and group work.
Quick summary: A LOT of work, and much of it fairly difficult, but reasonable grading practices make the course achievable. Learned a lot as well.
Good:
Professor: Polo is a smart guy and knows what he is talking about. He has done a lot of work in the field.
Real-world Application: All of the homeworks (which honestly are more like projects, think anywhere from 10-40 hours of work each) could be shown in a work portfolio when finished.
Extra Credit: Up to 3 points of extra credit are available through fairly straightforward multiple choice quizzes.
OK:
Grading: While the homeworks were graded fairly harshly, the projects seemed to be graded leniently, which created a balance: even if you struggle through the course, you can still get a B pretty easily and can realistically get an A.
Bad:
Lectures: The lectures have very little to do with the assignments. I honestly just stopped watching them halfway through the course.
Teaching Ourselves: While the assignments did make me learn a lot, I felt like I had to teach myself most of this course.
My Background:
23 years old, recently married, graduated undergrad in May 2017.
Coding Experience: Moderate (academic experience with many languages, mostly Python and R)
Statistics Experience: Moderate (4 undergrad level stat courses, Intro to Analytics Modeling, Regression Analysis)
Math Experience: Moderate (peaked at 2nd year Calculus)
This class is A LOT like boiling water.
Picture if you will: Each question for each HW (~4 questions per HW, for 4 HWs) is like heating up a big pot of water. It takes a lot of energy and effort for each question to come to the correct answer (setting up software/coding environments, learning many different coding languages on the fly, constantly checking Piazza 24/7 for TA corrections/clarifications and adjusting your code to reflect these changes, etc).
By the time you (sort of) figure out what you are doing and have all these moving pieces working together to get the correct answer and start making steam, you take that massive pot of saturated water with a ton of latent energy that you baked into it over 10-20 hours, AND THEN YOU DUMP IT DOWN THE DRAIN AND START ALL OVER FOR THE NEXT QUESTION. BWAHAHAHAHAHAHA.
Rinse and repeat this another dozen times or so and this is DVA. By the way, this doesn’t include the team project, which also adds another 100 hours to the course.
Seriously, the course really needs to focus on a sub portion of all the materials it discusses, and less on busy work. Really, I think the course could focus almost entirely on HW2 (which is JavaScript, d3, HTML, CSS) and still be a pretty work intensive course if you know very little about front end web programming (which seemed to be the case for most of the class). At least then I would have the confidence to be able to place those concentrated skills on my resume. As the class stands now, I touched so many things at once at such a high level, I really can’t talk about them at an SME level within an interview.
This asks a basic but very critical question for the future design of the course: In terms of my career and the ability to market skills gained from this class, have I really gotten anywhere 300 hours later?
This was my first class, paired with IIS. I work with data and found the material very easy yet unnecessarily time-consuming (read: busywork) and learned little from lectures or homework. I picked a project I was interested in for my group and got some value out of this (doing most of the work myself, as these things go).
- Lectures are very high-level and brief
- Homeworks are either extremely detailed (D3.js - legend three px to left, etc) or all over the place (hello world on AWS & Azure using five different tools). With prior experience, they take 1-2 days at most and are an easy A with some attention to detail. The downside is, of course, you’ll also not learn much. Without experience, you may spend several days on each project to climb up the learning curves without really getting very far.
- The group project should bring everything together but student backgrounds are diverse so it’s worth selecting a group carefully. Lots of people dropped half-way through.
If you want to dive deep into data science and spend time on an ambitious project, the course gives you a platform. However, there’s a substantial risk that you end up wondering, as I do, what the point of this class is.
Pick a Lane DVA
Are you a fan of busy work? Do you want to be able to throw around buzzwords at cocktail parties, but not know what they actually mean? Well, then buckle up, because I have the perfect class for you.
This was the first semester of the “new” version of DVA and it was extremely frustrating because some of what was covered was very interesting, and if the class were structured differently, it could have been an awesome, valuable experience.
Rather than a thoughtful, well structured course, I encountered busy-work that passed for homework, and I retained nothing. Grades aren’t out yet, but I’m on pace for an A, for what it’s worth.
Many hours were spent just setting up environments only to change 2 lines of code just to prove that we had set it up. In HW #3 alone we “learned” so many things:
- AWS
- Hadoop
- Azure
- Spark
- Scala
- PIG
- Virtual Machines
- Machine Learning
Any and all of those things are potentially useful but JUST PICK ONE – why on earth would you want people to spend hours setting up each system rather than just picking one and having them learn it well?
Oh well, at least it’s over…
Took this summer ‘18 after taking ML4T and RL. It was not a good class. The lectures were pointless. The assignments were mostly busywork.
It is a nice course to start with if you have no experience with R or data analysis. The workload is ok but I have seen many people spend much more time on the project. It is easy to understand because you can always spend more time to make the model better. However, the lecture contents are very limited and you would find them not very helpful for the project.
This course will be re-worked and combined with another version of DVA. Hope the course quality will be improved.
This class is getting replaced so this doesn’t matter!
For the ML track specialization, I definitely recommend taking this course first. It’s relatively easy but will require quite a bit of time for assignments, activities, quizzes, and exams. The course uses R for data visualization, which I strongly felt should have been Python. It also touches on linear and logistic regression in R. From an industry perspective, I didn’t find it very helpful because it used R. The lecture notes are not too great, but not very bad either. Definitely a good start.
Warning: This class will have drastically changed since I took it in Spring 2018, so I’m not sure how much this review is worth.
This course was a perfect intro/orientation into Machine Learning and R for me. The presentation of the material (Lebanon’s lectures) itself was the low point, but not terrible. I supplemented the material in this class with A LOT of outside text and MOOC material. I wish I had waited to take the updated format of the class, but I would take this class again in the form it was because it was absolutely foundational to classes I ended up taking concurrently and after (AI and ML4T). Studying R alongside learning Numpy in another class helped me write better Python code. Learning the basics of Regression/Classification and taking plenty of time to examine the mathematical rationale behind it aided me in projects in other classes which seem to assume this knowledge. Dr Joyner was engaged, motivating, and helpful. The TAs did a pretty good job.
I’ve previously taken 3 courses (CS 6310, CS 7646 ML4T) before taking this course. The most difficulty I ran into was learning the R scripting language, but it’s really not that bad once you get into it. The lectures weren’t the most interesting, but I got a lot out of a few of the assignments, specifically the logistic and linear regression assignments. The exams are a little annoying in that they’re multiple answer and the wording was a tad confusing. If you are organized with your notes leading up to the exam you should be fine. I was able to take a trip to South America while in the class, turned in my project on time, and finished with a 95%.
There were 10 assignments: 4 activities, 4 homeworks (coding, papers), and 2 projects (project 2 builds on project 1). The 2 projects are worth 20% each, and the outcome and a portion of the code from project 1 are reused in project 2. The 4 homeworks are worth 10% each and aligned well with the class materials.
Activities were for extra credit and graded mostly on completion, but they were also a good opportunity to learn new techniques for visualizing data.
Besides those, there were 2 tests worth 10% each, open everything (internet, notes, RStudio…), but the questions were tricky.
Be prepared to do logistic/linear regression down to the algorithmic level, meaning you will learn how to derive and reason through the whole thing. It’s rewarding but can be frustrating to work through, and you have to use calculus. The visualization part was fun: you will learn several types of charts and several packages that help you visualize data effectively in R. The 2 tests were hard and tricky, but they are only worth 10% each, and if you manage to finish all the assignments well, you will get a good grade.
NOTE: This structure and material might be changed in the future, professor Joyner stated Summer 2018 would be the last semester using this material.
Good class as a summer class because you can work ahead on most of the homeworks and projects. It serves as a good introduction to data analysis as you learn R.
I don’t know if the average workload per week indicated on this database is being skewed based on historical data collected from the time it had a different professor, or a natural tendency for people to report less effort than they actually put in hoping to look smarter, or I put in way more work than I needed to for the class, or a combination of all three. The fact of the matter though is that I respectfully disagree with the 12 hours/week of effort average number. I just exchanged some notes with another student (an awesome one, no less), and his opinion about the course is very much aligned with mine.
The first half is dedicated to learning the basics of R, which made the workload per week reasonably in line with the 12 hours reported to date (assuming you never saw R in your life, like I hadn’t at the time). However, the second half seemed to be a race to the finish line with a lot packed into just a few weeks (at least for me). The actual specific subject of the class is all presented in the second half, and the fact that I took the class in the Summer semester certainly didn’t help the cause. I felt I put in 30+ hours per week (especially on the projects) in the second half of the course and still felt I could have delivered a better work product at the end.
I suppose that, since this is a Master’s degree course, one gets what one puts into any class, and I can attest I learned a ton. I also suppose that, if the objective is just to get a grade of B or better at the end, one can probably put in little effort (probably even less than the reported 12 hours) and still get away with a good grade. However, if you are an avid learner who will try to acquire and absorb everything till the last ounce of knowledge offered, be prepared to put in some serious work, especially in the second half of the course. All in all, I would take it again for sure. I learned quite a bit about regression (logistic and linear) and stochastic gradient descent. I have one criticism: very little attention is paid to regularization, as it was rushed through at the end of the semester and no practical activity really demanded that it be implemented (though about 50% of the final exam questions are on it). These subjects can be seen as an intro to Machine Learning, and I believe it will help me quite a bit when I finally face Darth Vader (CS-7641 - Machine Learning). If you are also on the ML specialization track, I seriously recommend you take DVA (before the other ML classes); just don’t expect a walk in the park eating cake if your goal is to take good knowledge with you at the end.
This course contains two parts. First half was about data visualization, which I felt was a bit tedious and boring. Second part was like an introduction to supervised machine learning, and it was not very deep in theory.
The schedule was a little bit tight, as there was a task due almost every week. Projects and assignments were not difficult but time consuming if you did not program in R before.
Grading was generous. There were even five easy activities for us to get 5% extra credit.
This was my first course, so I cannot really compare it with others. I think it is worth taking if you want to learn R and get exposed to the daily work of data analysts.
I was nervous about this course after reading the bad reviews for previous semesters, but found it fine! The assignments were ok and ease you nicely into R. David Joyner was eternally helpful and the TAs seemed fine as well. Projects were significant chunks of coding and writing but structured with a series of questions which helped. The content wasn’t ground-breaking, but was a nice introduction to a lot of useful stats concepts. Would recommend for a first course/ML elective.
TLDR: This is a graduate program, you are expected to work hard and work beyond what is provided to you. This course is an intro to ML/Data Science. Don’t take it if you took ML already.
This was my first class at GT. I paired it along with KBAI. Although it’s recommended to not take more than one class at a time, it wasn’t my first time taking online classes, so I knew how to deal with the self-discipline of handling an online class. That being said:
The class wasn’t difficult, but it was challenging. The first half of the course is a breeze (assuming you have programming experience and know basic statistics/visualizations). The second half of the course was more challenging, but it’s where you learn. We implemented Logistic Regression from scratch, had to derive the cost function (partial derivative), and analyzed datasets, etc. The course serves more as an intro to Data Science as it covers visualizing data, cleaning data, modeling data, and deriving conclusions from data, overall making you think like a Data Scientist. I don’t have a formal ML background (although I work in a Data Analytics team, so I’m exposed to it), so a lot of this material was not new for me. I do believe this is a good intro to ML/Data Science.
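To make the "derive the cost function (partial derivative)" step concrete: for the standard cross-entropy cost, the partial derivative with respect to weight w_j works out to the average of (predicted probability minus label) times feature j. The sketch below is a pure-Python illustration of that result with plain gradient descent. The class itself used R, and the function names, learning rate, and step count here are assumptions for illustration, not the assignment's actual specification.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def gradient(w, X, y):
    """Gradient of the mean cross-entropy cost.

    dJ/dw_j = (1/n) * sum_i (sigmoid(w . x_i) - y_i) * x_ij
    """
    n = len(y)
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            grad[j] += err * xj / n
    return grad


def fit(X, y, lr=0.1, steps=1000):
    """Plain batch gradient descent; lr and steps are arbitrary illustrative values."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, y)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def predict(w, x):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0
```

Each row of `X` is assumed to start with a constant 1.0 so that the first weight acts as the intercept.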
On another note, you will have to do independent studying to handle the 2nd half of the course. Prof Lebanon is quite dry and personally it was very difficult to pay attention to his lectures. Also, I didn’t enjoy his teaching style. I made good use of Andrew Ng’s ML course in order to understand some of the topics discussed in the second half.
Having completed only one of the extra credit tasks (called Activities), I ended up with an A. Professor Joyner taking over is probably the best thing that could’ve happened to this class.
One annoying part of this class however was how unprepared some students seemed to have been for a graduate program of this caliber (this is one of the easier courses of the program). Piazza was almost unbearable with the amount of simplistic questions that could have either been figured out independently or by a simple google search. I understand lots of people don’t come from a CS/Software Engineering background, but the amount of hand holding some people needed/expected was almost ridiculous.
Classwork was relatively simple, but working in R proved to be difficult for this newcomer. When the class took a hard pivot to logistic/linear regression, the material was more difficult, but the grading was more lenient. If you know how to use R, this class will be a breeze. Otherwise, the learning curve for the programming language was the largest hurdle for me.
There are two phases to this course. Phase 1, from day 1 to Exam 1, is good, fun, and motivating [R, R and R]. In Phase 2 the course takes a turn into the ML side, which will be very hard if you have not taken any course on ML. I come from a non-ML background and had a very hard time progressing in Phase 2. BTW, don’t expect the course material to help in Phase 2.
I went into this class with no R experience and did well. Excellent feedback was left on assignments pointing out areas for improvement. The extra credit assignments were well worth the effort and addressed topics a little outside the flow of the course. The most time consuming part was the first project, which required a lot of figuring out how to use R to accomplish a number of data processing tasks. Outside resources were needed, so don’t expect the lectures to cover everything. The lectures taught by Professor Lebanon, while somewhat verbose, were excellent. I had to watch them a couple of times to pick up on Professor Lebanon’s many insightful points.
Good introduction to R. The videos at the beginning were good but later on lost their charm. The topics on ML could have been done better. The homeworks, projects, and activities are very good. They test and improve your R skills. I ended up using a lot of online resources like Stack Overflow to deal with the R syntax and libraries needed for the homeworks and projects. The course teaches how to visualize data, pre-process data (handle missing data), do feature engineering, and finally linear and logistic regression. Knowing some R libraries can really help speed up coding for the homeworks and projects. The visualization part of this course is also pretty good. They teach ggplot and other visualization techniques through activities (like graph networks and plotting data on a world/country map), which are very powerful for creating charts with a single line of code. I combined ML4T with this course and managed an A in both. The 2 are a good combination. The DVA exams are good: open book, open internet. The second exam was special, much more conceptual than memorization-based. I didn’t spend extra time watching the videos for exam preparation and still ended up doing well.
This is a great introduction to ML (Logistic and Linear Regression). If you have any background in this topic then the class is very easy. Essentially the class teaches you how to use R in a very gentle manner.
There was 5% extra credit on offer.
The class has been significantly improved since Dr. Joyner took over. I had no complaints as the class was run extremely well. This class can easily be done with another easy or medium difficulty class.
The projects and HWs were well thought out and led from one to the other in a systematic manner. Finally, the exams were easy multiple choice with open everything. So the class was essentially an Easy A class.
I would recommend that all new starters thinking of doing ML specialization do this course as it has been significantly improved. You can ignore all the old bad reviews.
This class used R and covered the visual presentation of results, and the techniques of logistic and linear regression.
I really liked how the class was organised by alternating easier and harder assessments - each time I finished a harder assessment it was followed by an easier assessment. This allowed me to catch my breath a bit after a difficult assessment, but perhaps more importantly it allowed me time to think about the work I had just done without charging straight into the next assessment. I think being allowed a bit of time just to ponder helped me truly understand the topics.
The staff were active on Piazza and helpful. Overall I really enjoyed this class.
This was a fun class, with a few frustrating aspects. The class was pretty basic, with the exception of homework #3 which was very calculus heavy (which I have no experience with) and there wasn’t much explanation for the math needed. For the rest of the class it was very easy, mostly just homework assignments where you had to shuffle data into different formats and then use libraries to display the data.
One frustrating aspect was that homework assignments were not graded in a timely manner. They often took 3-6 weeks to be graded. Part of the problem was that several TAs had accidents or unplanned events happen and so other semesters of this class may not experience such a long delay.
I also wish that there had been more helpful feedback on assignments, usually they just marked the number of points you missed with a short comment to what section was wrong, but the comments were usually very cryptic.
Dr. Joyner was very kind when you approached him with grading questions and there were several times when an assignment would be awarded points because there were several students that missed a single question or because something wasn’t very clear.
On average I spent 7 hours and 48 minutes per week, but there were some weeks where I did a lot more than others and the range is from 1:20 to 16:41 (that was one of the project 2 weeks).
I would recommend studying calculus (which it does mention on the course info page) before homework 3, but other than that this was a pretty easy class.
Great first course for OMSCS. After taking the course, I have a much better understanding of regression and the basics of machine learning as well as a pretty solid foundation in R.
I have an undergrad degree in CS from 10 years ago and haven’t coded much since then, and I found it fairly straightforward to keep up. The homework assignments and projects were fairly time consuming for me because I didn’t know R at all, but they were interesting and I learned a lot. Some weeks I spent 10-15 hours, other weeks I spent 0 because there was only a small extra credit assignment due. I found it fairly easy to get an A.
I watched the lectures religiously early on, but later on I found the class discussions and google to be much more helpful. The TAs were helpful, but honestly I found the other students to be the most helpful. The professor was more of a manager of the TAs than a professor, but that’s fine.
I was intimidated by the requirements of linear algebra and calculus since I am many years removed from college, but it didn’t turn out to be that bad. One assignment requires you to do partial derivatives to come up with the equation you need to implement, but it wasn’t too bad to fumble through with the help of Piazza. The other assignments did not require much deep math knowledge.
It was my first course. You had better learn R in advance since it is mandatory for the homework and projects. However, the course is very reasonable and easy to follow even though the two projects are quite time consuming. The schedule of the course was not tight, so you could take another course as well. There were two exams, but they were open book and not that difficult. If you do not miss any projects or homework, you will easily get an A. I recommend it.
This was my first course in OMSCS. Initially I was apprehensive about R and linear algebra and calculus (last time I did that was 17 yrs ago). For learning R, the course provides sufficient time to learn the language and I also heavily used the DataCamp “Data Analyst with R” track and it made the homeworks and projects a breeze. To tackle algebra and calculus, I referred to Khan Academy. Both these resources are more than sufficient to prepare for the course.
The homeworks, projects and activities are announced up front so that we can plan for them. Make sure you do all the activities since they are extra credit and help if one of the homeworks or projects does not go well. The grading is lenient and it is not difficult to score an A. I think this is a great course to take first up in the program.
Simple course. Try to do the activities as they are very simple and easy scoring. The homeworks are quite simple too; the projects need some thinking. Try to learn R before starting the term.
This review is for the OMSCS version taught by Dr. Joyner.
It seems from some of the reviews on here and comments on Piazza, that this class is very polarizing. Either you seem to hate it or love it. I personally hated it. It’s my first OMSCS course, so I have little to compare it to.
Pros:
- The class is extremely easy. The first quarter of the class is basically an introduction to R and an overview of the difference between scatter-plots and box-plots. If you can't pull an A, or at least a B, a master's program may not be right for you at this time.
- Dr. Joyner does seem to care about his students and this program. While he seemed overwhelmed at times, you could tell that he cared and wanted everyone to succeed.
Cons:
- The homework/assignments/exams are poorly written. This is especially true for both the midterm and final exam, which had quite a few questions that could have multiple answers depending on the interpretation of what was actually being asked. A few students started a mini Piazza "revolt" where they threatened to escalate issues with the class/midterm if their grievances weren't answered (ironically, they then learned that the person to escalate these issues to for OMSCS is Dr. Joyner).
- There are a lot of helpful TAs, but they never seemed to be on the same page. There were multiple instances where I would get two very different answers to the same question from different TAs. For example, multiple TAs said on Piazza and Slack that homework had to be done in LaTeX to receive credit. However, when they posted "exemplary homeworks" one student had literally hand written his homework, taken pictures with his phone, and cut/pasted the images into a word document.
- The lectures are hilariously bad. It's like they were made in some dystopian future where everyone must speak in a monotone voice and ideas can only be expressed through Word 2010 clip art. More importantly, some of the required readings have massive errors. I was pretty disappointed that this course has been going on for multiple years, and this hasn't been fixed. I spent a bunch of time trying to figure out Dr. Lebanon's logistic regression proof, only to realize (about two hours in) that it's just wrong.
Overall, if you want an easy A or want to learn R, take this class.
When I took this class in Spring 2017 there were lots of issues with the way the class was run. You will mostly find negative reviews for that term. Even with all the issues I did learn a lot from this class as it was my 1st ML class. I liked the overall course content and enjoyed all the projects and homeworks. This class will prepare you for a Data Analyst role and may even be a good starting place for a DS role. If you are interested in learning R and using R for data analysis and visualization along with a few ML concepts then this is the class for you.
This course was very organized and instructors and TA’s were very friendly. This class was very active on Piazza and most of my general questions were asked and answered by other students. I did however have a specific code question and sent a private post to my assigned TA but received no response.
The first half of the course teaches the basics of R and how to model data. This was really helpful because I had no experience with R coming into this course. The second half of the course teaches a few machine learning techniques. I think this is a great course to take before you take the harder machine learning courses.
The homeworks in the first half of the course were very easy and fun because you get to play with R to create all kinds of visuals for the data. The homeworks in the second half took longer but I felt they were valuable lessons to learn machine learning if you haven’t learned it already. There were 2 projects, both of which are not hard if you dedicate the time to do them. Project 1 took much longer than project 2, but project 2 builds off of project 1 and is easy if you did well on project 1 and were able to reuse some of the code. The exams were very easy if you watched the Udacity lectures and read the course readings. Exams are open book, notes, internet (the only thing not allowed is help from another person, chat boxes, etc). The instructor tried to stick very closely to the materials provided and gave us points back for materials on the exam that were not covered or phrased misleadingly.
I think the grading was very lenient on all homeworks, projects, and exams. It would be hard not to receive an A in this class.
This semester, the course was well organized and well managed. All the projects, homework and optional assignments were posted at the beginning of the semester, and the rules were very clear.
The schedule included a project every week (some of them optional and very small, but there were two larger projects) and a Udacity lecture accompanied by a short reading every two weeks or so. Plus two open-book, open-internet exams. It was very manageable. To do well, one just needs to stick to this simple schedule.
The course is R-based, but it assumes no knowledge of R, spending the first couple of weeks teaching it. There is no need to learn it beforehand.
The first half of the course could be called “Intro to working with data using R”, while the latter half could be called “Intro to Machine Learning”. It’s not clear to me why it spends so much time on machine learning topics, but it does, covering subjects such as linear regression, logistic regression, and regularization.
Interacting with professor Joyner is a great pleasure - he’s a fantastic teacher. On the other hand, the course seemed understaffed. I don’t really mind the grading taking a little longer, but I think it’s unfortunate that our inquiries about specific midterm questions were never fully addressed.
Overall it’s a solid course, especially for people new to the program.
Pros:
- Easy class to get an A, especially with all the extra credit activities and lenient assignment/project grading.
- The topic is fairly interesting and gives you a solid foundation for machine learning.
- TAs and Prof. Joyner are super helpful in terms of communications and clarifications. Great work there.
Cons:
- The class felt too easy for a top graduate level CS program
- The video lectures are useless beyond just giving an overview of the topics. Reading extra materials and researching online is a must
Misc:
- Remember to use Brent’s notes for your exams (you’ll know what they are when you are taking the course) They are the best.
Positives
I enjoyed the assignments, and they seemed to be only as time consuming as one wanted to make them (for the most part, anyway). If you’re a perfectionist and like finding out the “right” approaches to problems (eg, using the R vectorization instead of a for-loop, or tweaking your regression parameters to minimize RMSE as much as possible), the assignments can take a while.
Each assignment was written in R, and you’ll get a pretty decent grasp of the language. I would feel comfortable using it in a professional environment now.
There are five activities you can perform that each add 1 point of extra credit to your grade.
The TA’s and students are very helpful on Piazza and Slack. I can’t even count how many times I got stuck on a problem, only to find a relevant post somewhere on the community boards.
Dr. Joyner seems to care a lot about making a somewhat infamous class better and fairer to students. I don’t agree at all with the below reviewer from my same semester who questions his ethics. It’s clear that Joyner listened to student concerns and made fair adjustments.
Negatives
To be honest, a few of the lectures were flat out boring to me. On the other hand, others were hilarious and bordered on the surreal - almost like I was watching a Tim and Eric episode. For example, Dr Lebanon has a penchant for using a cutout of his own head to illustrate topics, like when he mashed together angel and devil clip-art over his shoulder to represent Frequentists and Bayesians arguing… all the while talking about the topic in his droning monotone. Hilarious. I don’t want to overstate it though - some of the lectures were great, and I’ll likely watch some of them again.
Lots of the assignment instructions were pretty unclear, although I’m thankful to the TA’s and other students for clearing most of them up. The midterm had some ambiguity in it too (not great for a multiple selection/multiple choice test), but to the instructors’ credit, they rectified the obvious ones and gave students credit.
Contrary to what most people say, I think this course was very poorly managed. TAs impose and enforce policies at will sometimes just a few days before deadlines and once retroactively. Some of the TAs are excellent, but you don’t know who’s grading your work and whether they decide to create a new rule. I will definitely avoid any courses with Dr. Joyner in the future. I’ve only seen a few professors that are less ethical and with less skill in management.
That said, the material is great. They are still using Dr. Lebanon’s videos and writings, which have numerous typos and are unnecessarily esoteric. But you do learn a bit, and it is a nice introductory course.
While there is room for improvement, this class definitely isn’t quite the hot mess that it was in some previous semesters. Like the Fall 2017 semester, Professor Joyner once again joined the instructor team to help ensure things ran more or less smoothly. And by “help”, I mean he appeared to be “the” instructor for the course (despite what the course sign-up states, Professor Chakraborty was not involved in anything student-facing). Dr. Joyner and the TAs did a nice job keeping us up-to-date and on-track, though it’s a continued pet peeve of mine when an instructor literally ignores their students for the entire semester.
Content-wise, the first half of the course’s lectures were pretty good. Dr Lebanon’s explanations and manner-of-speaking tend to make things needlessly complicated, especially when getting into the more math-heavy topics, but nothing was too overwhelming. The second half of the course was a lot harder to follow, again mainly due to the way the content was presented. Several students posted alternate sources of information on Piazza that made the concepts much easier to understand. It’s like too much focus was placed on the theory and exact formulas rather than what it actually means in practice.
The projects and homeworks were all pretty fair, in my opinion, even though I came in with no R experience whatsoever. Some assignments were intimidating to start, but became more manageable after spending some time playing with them. The TAs also posted helpful hints for some assignments. The exams were both a bit crude, handled via ProctorTrack. The midterm had a lot of disagreement with wording and ambiguity. The instructor team did eventually address a few of the questions about a month later and gave some points back, but other questions were never clarified.
Grade-wise, it still seems fairly easy to get an A as long as you’re not waiting until the last minute to do an assignment (and almost everything is available from the first week of class, so there’s no need to wait). Four homeworks worth 10% each, two projects worth 20% each, two exams worth 10% each, and 5 activities worth 1% each. Since that’s 105%, you’ve got 15% of leeway to stay in “A” range. There didn’t seem to be many issues with grading as long as you followed the requirements.
Communication-wise is where things could have been better. The TAs were great for office hours on Slack and generally responsive on Piazza, but private posts would get ignored and it seemed like there were some topics the TAs didn’t want to get involved with.
All-in-all, a pretty good class if you’re interested in R, but the lectures were often awkward to watch and the algebra portions didn’t really add much to the course in any meaningful way.
Excellent class! I think Prof. Joyner changed a couple of things and now it is so much better. You can work ahead if you want to and you learn a lot along the way. I’d never worked with R, but by the end of the class I felt pretty confident using that language to analyze and visualize data. This class has an excellent ratio of Theory and Practice.
Also, because of the practical side of this class, I was able to get a job as a Junior Data Scientist!! So, in short, you’ll learn R, Data Analysis and Visualization, Intro to ML (Logistic and Linear Regression) and hands-on experience in case you want to look for jobs in Data Science.
That’s a 10 in my book, hehe.
This is my first course of OMSCS. I have no previous R programming experience, but I do have ML experience.
In this course, you will be taught the following topics:
1- R Programming Language (Excellent part! All data scientists should learn a bit of R)
2- Data Visualization (This is the part of the course I love; you will learn a lot of different ways to visualize data)
3- Preprocessing Data (Also very good; it is very important to learn how to clean dirty data)
4- Logistic Regression (Not good; the terms they use are too complicated, and even with previous ML experience I still couldn’t understand these lectures)
5- Linear Regression (Same as logistic regression)
6- Regularization (Not bad, but it could be better)
Again, this course did an excellent job on R and data visualization/preprocessing, but I completely disliked the second half of this course. If you are going to be in the ML specialization, take CV instead.
(Only took this class because it was a requirement for ML specialization. Now that CV counts for ML, I would strongly recommend taking that instead.)
I basically learned nothing in this class that wasn’t already covered in the first few weeks of ML. Do you already know what a histogram, scatterplot and boxplot are? awesome, you can skip most of the first half of the semester.
I like Dr Joyner, but I think he inherited a bad situation. Lectures had a few interesting bits, but mostly took a long time to say something pretty straightforward. Homeworks/projects were tedious and had to be written in R. Graders deducted points for random things which weren’t even in the instructions. Both of the exams were poorly-written and somewhat ambiguous. (There were disputes on a significant number of the questions.) On the plus side, the homeworks and projects were released ahead of time, so at least you could work ahead if your schedule allowed.
I took this course with Dr. Chau (OMSA). The OMSA course consists of 4 very large assignments and a semester-long group project. I believe this may be the most time-intensive course in the OMSA program, and I recommend either taking it early or as a “capstone” class to finish up the program.
The most challenging part of this course was the amount of time that needed to be put into the assignments and projects. The concepts were arguably quite simple, but each assignment required learning one or more new programming languages and/or technologies. Setup of the required environments often took a lot of time too. Dr. Chau was receptive to feedback, and modified the last assignment to have one of the questions be for bonus credit, which helped it be more manageable. It’s possible the online course will change dramatically based on the course feedback. That said, it is an incredibly time intensive course and you should plan accordingly.
The project was… frustrating (to me). The assignments were so time intensive that our team struggled to work on the project early on and ended up pulling things together at the last minute. Another challenge was the difficulty in making group decisions when a group leader wasn’t established and the whole team couldn’t find time to meet virtually.
I will address the concern about using Azure and AWS with a personal credit card as stated in another review. While, yes, it was annoying and uncomfortable to provide that information, for both services we got much, much more free credit than was needed to complete the assignment, and students who did go over the free limit were able to get the charges waived by calling customer support and explaining the situation. The Azure part of the assignment could be completed on a personal computer and then uploaded to the server, eliminating any chance of error/using up the credit.
I learned so much in this class and was exposed to a large number of data technologies. I know there are some people who struggled greatly with this class, but I found it comparable to (if not easier than) my undergrad engineering courses at a top-ranked engineering school, so the quality was what I expected. I think the class does have room for improvement, and could use more TA support to improve communication and responses to Piazza. All said, I wish I had used this as a capstone class toward the end of my degree and had taken it on its own, so I could have gotten more out of the project.
I took the OMSA class with Dr. Chau. This was by far the most challenging and time-consuming class I have ever taken (my undergrad was ME from Tech). I also learned more in this class than any other class - a direct output of the fact that I spent more time on this class than ever before.
Dr. Chau is extremely reasonable and truly cares about his students. It is easy to get an A in this class as long as you spend the time. Generally you will be able to keep working/debugging code until you are certain that it is 100% correct. You have 3 weeks per HW assignment, which is reasonable as long as you start working immediately and don’t procrastinate on the group project.
My only gripe about this class is that it should be worth 6 credit hours instead of 3. There is no way that the amount of content covered and work required for this class should be weighted the same as courses like ISYE 6501 and MGT 6754 which generally are in the 5-7 hours/week range.
This course has been an incredible source of stress, anxiety, and frustration. Though I expected the work to be difficult (my courses last semester in the OMSA program certainly were), it has bordered on impossible at several points given the unreasonably complicated nature of the work we are assigned. I spent more time decoding confusing instructions and being my own IT support this semester than I did working on any visualization projects or concepts. I was very close to withdrawing from the class and if it weren’t for the group project and the fear of letting my teammates down, I probably would have dropped it.
For each homework set I spent at least the first week (40+ hours) just trying to get my computer set-up with the software/environments necessary to even start programming. In several instances this has required making changes to the firmware, multiple uninstalls and re-installs all of which are incredibly time-consuming and I think not the point of this course. My only computer is my work computer and I have felt quite nervous about how these changes I have made might impact my machine and my ability to do my job if it were to crash.
To complete a homework assignment students were required to join Amazon Web Services and Microsoft Azure, both of which required entering personal credit card information. For each of these services there was the potential of running up thousands of dollars in charges if a mistake was made. This is an entirely unacceptable risk to ask students to take. Often learning takes place when mistakes are made and to put a financial burden on students who are already paying to learn is wrong.
At several points throughout the semester direct messages to the instructors went unanswered for days. At one point a very urgent message (labeled as such in the subject line) was ignored for 84 hours. At the point it was answered it was too late.
The difficulty I have had this semester has caused me to question my future in this program. If the 40+ hours per week of work needed for this course is indicative of any future courses, continuing to pursue this degree will be incompatible with maintaining my full-time job and family responsibilities. I understand that Tech’s rigor is a part of its great reputation, and fully expected my classes to be very challenging. However, I did not expect to be subjected to unreasonable expectations and poor communication by professors in a professional graduate curriculum like the “weed-out” courses I had during undergrad.
Course was well run with Dr. Joyner. The homework, activities and projects were interesting and directly relate to the course lectures. Piazza was very active, and the TAs and other students were very helpful with questions. Great course to take as one of the first few courses in OMSCS. The first homework is R focused, so you have some time to pick up R if you have not done it before.
This was a good class as my first class in the program. You could easily work ahead if you wished, though reading the Piazza discussions about the material often helped if you waited till later. As with most grad school classes, you get out of it what you put into it. If you want to spend a lot of time and really get things perfect you can. If you want to spend a lot less time and do the minimum you can still do pretty well; you just won’t learn as much. If you want a good way to learn R this is a good way to do that and learn some visualization things as well. Liked the projects a lot. The exams were a little tricky, but the instructor was willing to go back and look at them (at least on the midterm). Have not yet attempted the final.
First thing is to clarify that there are two versions of this class, I will talk about the OMSA version.
I loved this class! I can say it is one of the best online courses I ever did. The instructor is very passionate and very knowledgeable and the assignments are challenging but you end up learning a lot.
You will learn a lot about D3, and some bits about modeling and Big data technologies, but you will get a good general picture of what a full data pipeline project looks like.
The presentations are very enjoyable and well crafted, and you will make some friends thanks to a group project that will require at least 4-5 meetings.
All this said, this course could be improved in some aspects, one of which is time management and a bit more hand holding. So don’t underestimate the amount of time you will need for it. There are 4 BIG homeworks, so don’t procrastinate and work steadily; if you do that, I estimate around 20 hours per week.
Very well managed class, interesting material for someone with no prior ML/Data science background. It was a great way to start the program
Great course! I’ve used R programming language for years but still learned a great deal about more efficient ways to manipulate and plot the data.
I took the first iteration of this course with prof. Guy Lebanon. The course is fairly easy and doesn’t require a lot of time, it’s a good introduction to some machine learning (very basic) and data visualization, which may come in handy in later courses. There were some complaints with the grading in the end, but that’s expected from first time courses.
I took this class after ML, which made it very easy as most of the ML concepts were just a refresher. This is a great introductory class for R / data visuals, but not a great one to start learning ML. I really appreciate Dr. Joyner’s effort to keep the course organized, especially by releasing assignments in advance so that I could better manage my time (among other priorities). However, this course doesn’t take much time at all. The only time consuming part will be the two projects (and only if you want to get in-depth). The final exam is of easy-to-medium difficulty (definitely a lot easier than the ML midterm).
A good introduction course to R and simple machine learning analysis. With 5 extra credit, I think it’s pretty easy to get an A which only requires a final grade of 90/105.
A very good introduction to R and data visualization concepts. Also, a brief introduction to ML concepts. The lectures were not helpful at times, especially towards the end (Regularization etc). Projects were time consuming, so make sure you have enough time on your hands to complete and submit on time. Extra credits for the Activities did help in the end to get an A grade, in spite of the difficult and slightly confusing final exam (open everything). Overall, I am glad I took this course in spite of the bad reviews from previous semesters. This time around it seemed a lot better organized in terms of communication - I believe it was the Joyner effect.
- R is intuitive & easy to learn if you have programming experience
- I got a B - thought I deserved A, missed by a whisker. The extra points of AC will help. Plan to get those
- Course content is fairly easy. Make sure you learn “Regularization” well - it is at the end of the course. The videos didn’t cover it well, I felt. I should have learnt it from elsewhere. My course concentration at that time slipped a little (Life got in the way)
- Final exam was tough
- Don’t expect too much from instructors. They were always MIA. TAs & fellow students on Slack and Piazza were very helpful.
Joyner ran the class this semester so everything went smoothly and was well communicated. The class is pretty easy if you’re familiar with R, otherwise it’s about ML4T difficulty. Basically, take this class if you want to practice or learn R, or have absolutely no exposure to data science work (cleaning data, feature engineering, simple models). FWIW I didn’t watch any of the lectures and still got an A. The exam was timed and proctored but open everything, including Google.
Course was a good introduction to data visualizations (histograms, box plots, scatterplots, etc.) and to logistic and linear regression. The course used R, which I have come to dislike mostly, at least compared to Python.
Videos are a good introduction to concepts that I had to research further. After further research, coming back to the lectures made more sense. Likewise with the readings. I had to iterate to learn the material.
I really enjoyed the homework and project. They were challenging, but applicable to further ML work.
The course instructor wasn’t on Piazza much, if at all, but there were several TAs who answered questions quickly, and fellow students were also very helpful.
Grading took forever, but this wasn’t really a problem since the homework was only minimally dependent on knowing earlier concepts, so not knowing if you did well on something wasn’t a hindrance.
The final was hard. Even being open book and open internet, it was difficult to complete within the 2 hours allowed. I feel I earned my grade.
Lectures are high level and not close to what may be needed to complete the homework and assignment. It requires higher level maths. I personally feel not much was learned through this class. Also, the course content needs to be refreshed.
The lectures are ancient. They were recorded years ago by Guy Lebanon. Guy may be a smart dude, but he is completely incapable of lecturing in a comprehensible way. Also, the descriptions and pre-reqs say nothing about the higher level math you need. I swear there were symbols I’ve never seen. I did 2 semesters of Calculus over 25 years ago (as well as a semester of linear algebra). The homeworks and projects were generally OK. Homework 3 required you to code logistic regression by hand, when there are perfectly good functions built in, so that was pretty absurd. Especially since the lectures were very high level. It was basically the old game where they teach you how to draw an owl by drawing 2 ovals and then in the next frame show you a finished detailed owl. Good luck with that one. This was my first class. If my next semester is the same experience, I will be dropping out. I learned very little (I actually have taken some ML courses before) and found it highly frustrating.
There are a couple of major problems with this class. 1) The professor is missing. I am not sure why his name is part of this course. This course needs to be redesigned by someone who owns it. 2) The lectures from Dr. Guy Lebanon are pretty dry and math based, and don’t provide an actual implementation basis. To me, it felt like you have to bridge the gap between theory and implementation. Without any background, it is very difficult. The course prerequisites do not say you need an ML background, but you do. 3) The grading throughout the course is inconsistent, and the exam is fairly difficult. 4) The TAs are helpful if you know the material. They will not teach anything. I struggled throughout the course with no significant help in understanding the material. Conclusion: Don’t take this course unless the course gets redesigned under a new professor.
Took this as my first and only course in OMSCS for this semester and I would say this was a good choice in terms of workload and difficulty level, as I’m working full time and it’s been years since I graduated from college. This course provided a refresher on calculus and linear algebra, which I guess will be useful when I get deeper into ML. I also took this course to learn R as a new language, and I think the course material did a good job on that, but maybe too much, as it seems to assume minimal background in coding. So if you, like me, have been a professional developer for years, you may find it boring.
The workload is light; the only exception is the first project, in which you need to do some partial derivatives and implement a logistic regression model in R from scratch. The requirement wasn’t clear until a couple of days before the deadline, which makes it hard if you want to finish it early at your own pace. But since most of the HW and projects are due early Monday mornings, you have a good chance to spend your weekend on them and should be able to get them done. I still remember the last project, which was due the Monday after Thanksgiving; I had to bring my laptop with me and did some coding while on vacation in Fairbanks, Alaska. It was a good memory though :)
Prof. Joyner’s presence has been helpful this semester. Apparently, one of the TAs dropped out, which has significantly slowed grading. Before they got behind on grading, the TAs and instructors were always available on Piazza. Now their presence is more like other OMSCS classes. There are lots of office hours available on slack. A few students expressed concerns about grading consistency with the first two homework assignments, but regrades are underway. The current construction has one homework that is machine learning heavy and that has been the source of the most complaints on Piazza. As with ML4T the topics are introduced in lecture, but possibly not in sufficient depth for someone without recent undergraduate background in CS theory and math (partial derivatives of vectors). Both ML4T and DVA include machine learning topics along with domain topics (pandas, stocks and R, visualization respectively). BTW, there is currently no mid-term, only homeworks, projects and a final. Overall, I’m glad I took DVA this semester. The semester is not over yet and I already managed to use some of my new skills at work.
The course material does not cover the homework assignments. The TA’s are not helpful, they constantly tell you to refer to the reading even when you say that you don’t understand the reading. The professor only sends weekly announcement. He does not answer questions on Piazza or Slack.
Interesting class if you want to see Data Science from an R (vs Python) perspective. R is an 80’s language with all the benefits and liabilities of a legacy system. The projects can be large and quite time-consuming; but not too difficult.
I can see how you would use R professionally and there is demand for the skill, but it’s not my cup of tea. IMHO: COBOL + Data = R
I took the class with Dr. Chakraborty. I agree with the other reviews on here. The class was basically a train wreck. Inconsistent grading, combative TA’s, unclear instructions, slow and inadequate responses from the TA’s and teacher. The teacher didn’t seem engaged with the class at all. Easily one of the worst classes I’ve ever taken. For those of you who are considering taking this class, be sure that you are reading reviews of the class with the current instructor rather than the old one, Dr. Lebanon. From what I’ve read the class was much better before Dr. Lebanon left.
The assignments took me a fairly large amount of time to complete, and I used the available 48 hour extension twice (for a ten point penalty). This course taught me a lot of practical skills that I could apply to my current job and would help anyone who deals with any kind of reporting. The data pre-processing required in the projects can get a little hairy, but it’s good practice. Several extra credit assignments were given during the course.
Good course in my opinion. Starts off easy and gets harder at the end. I knew R coming into it which made everything a lot simpler. Definitely a good course to take before you take machine learning. Gives you a background in R, and touches upon the beginnings of machine learning with linear regression.
A severe disappointment both in quality of instruction and content of instruction. The course contains less than half of the material taught by Prof. G. Lebanon back in 2011 while he was teaching at Gatech.
Context: I took this class in Spring 2017, as my first OMSCS class, and without any prior ML experience. Before I took the class I checked the workload estimate here, which in my case was an underestimate. The workload varies depending on the assignment: there were weeks with +/- 30h, others with +/- 10h. Overall, it was not an easy class. Piazza was too noisy and sometimes overwhelming, which made it hard to focus on the valuable posts. I believe this is a common problem for all classes with large audiences. I hope the next classes will run smoother and better. More details: https://cse6242.gatech.edu/spring-2017/schedule/
This class had so much potential, but I think the switch to a new professor with limited experience running an online class really limited the class. It suffered greatly from “new class” issues, and I think the large class size in combination with the new professor and long homeworks / projects overwhelmed the TA’s. I agree with most of the other posts as it was a poorly run class. But, in spite of all the negative points, I still managed to learn (mostly on my own) about the R language, regression analysis, and some data analytics. If you are the type of person that is willing to learn and investigate topics on your own, this might be the class for you. Otherwise I would suggest passing on this class until it is cleaned up.
I’d taken this course seeing the highly positive reviews on this site for the previous semester offering. The Prof was changed for this semester and I found the course was very disorganized. Most of the students in the class seemed not so impressed. Hopefully changes will be made for the next semester.
This was by far the worst class I have ever taken–online or in person. It was an easy A for me but I would rather have had the class be more challenging and actually learned more–I don’t feel like I learned much beyond some very basic Machine Learning concepts.
The professor and most of the TAs were mostly absent from Piazza and it was like pulling teeth to get any sort of response from them. The instructions for the homework weren’t clear and it was difficult to get anything clarified. The TAs would tell us something and then the homework would be graded the opposite of what they had told us. TAs and professor would contradict each other all the time. At the time of the final exam, I had only gotten back my grades for HW1 and HW2 (out of 3 homeworks, 2 projects, and some additional extra credit activities). This was frustrating since some of the questions on the final were related to things we did in the homework and projects, and when I wasn’t sure if I had done those things correctly I thus wasn’t sure how to answer the questions on the final correctly.
I got high scores on the homework and projects but I have to wonder if it’s just because there was such a backlog of grading to do close to the end of the semester that the TAs just rushed through it and gave almost everyone As. I’m guessing at least half of the class ended up with an A, especially due to the extra credit–it may have actually been closer to 75%.
I was hoping to learn a lot about data visualization but it was mostly just a tutorial in how to use ggplot2 and a short discussion of the different types of plots one can create. The lectures were very shallow and general–I wish there had been more of them. It was probably only about 6 hours of lectures total for the entire class. It just felt like it wasn’t enough, and it also felt like the professor (who was NOT the same professor as the one in the lectures) didn’t really care about this class at all.
DISAPPOINTED!!!!!!!!!!!!!!!!
This was definitely the worst class I’ve taken in the OMSCS program.
I had such high expectations because the scope of the course looked fantastic! I was excited about learning R and for the potential opportunities I would have to apply what I learned at my workplace. But basically the disorganized, disastrous mess of this class crushed my hopes and dreams.
Things I saw in this class that I’ve never seen in other classes:
- Constant contradiction between TAs and professor
- 6-7 week wait to get a grade back for any assignment
- TAs abandoning the course for unknown reasons leaving all the work on a handful of TAs
- Zero guidance on assignment / project expectations
- Material on the final exam was nowhere to be found in the course
At the end of this course I basically lost some interest in Data Visual Analytics. It feels like a chore to me now because of how much I really disliked doing work for this class. I’ll never forget the moment when the first grade was returned in this class. The Piazza statistics for the class showed that there were 6x the number of posts of a typical day when it was released. People didn’t understand what they did wrong. And it took over a week for the instructors to generate any kind of reasoning for the grading. One would think that a rubric should be available to handle discrepancies… but it didn’t appear that way. For the rest of the assignments, I felt like I had to perform 150% just to get a good grade on an assignment. For the second project, my report was 40 pages long. I did get an A but I honestly wasn’t confident that I had done enough.
The professor did admit at the end of the semester concerns had been heard and improvements could be made. So hopefully they will be!
On top of the disorganization, this was also a hard course in general. Here are some things to review if you decide to tackle this course:
- Machine Learning
- R (the language)
- Statistics
- Calculus
- Vectorization (specifically in R)
- Big O
This was the most horrible course I have taken in life EVER!
- Very disorganized and chaotic. Contradictions among TAs. No grading rubrics.
- TAs abandoned ship after 2 weeks and we never got any response for regrading requests. My question to the instructor is still open for HW1 and I haven’t received a response after 3 months.
- Lectures were shallow while the projects/assignments expected in-depth knowledge.
- If you know ML already, u won’t learn anything beyond ggplot2 (plotting few graphs). If you don’t you are going to be in hell that term.
- Final exams had questions and concepts that weren’t taught in the course.
- I don’t think I would have gotten an A without taking few external ML courses which added a huge burden to the time consuming projects that are there in this course.
- Oh, and if you by mistake looked at the course offered on-campus, that is in no way related to this online course. So be ready to be surprised if you thought you were going to learn some cutting-edge technologies to do real-world big data analytics. This course is 15 years behind the technologies and what is happening in the industry.
I wouldn’t recommend this course until it is revamped with new videos, content and properly organized.
When Professor Lebanon taught this course, it was excellent. A few bumps with it being a new course were far outweighed by a highly engaged class and teaching staff that made it very enjoyable. There were real-world problems and a challenging/fun exam that pushed you to think instead of just recite back lectures/material. It was useful to learn the guts of the mathematics behind some of the core algorithms (e.g., gradient descent). Some students complained/struggled about having to use high school/early college Calculus but it should be reasonable/expected in a graduate program like this one and in particular the field of ML and Data Science. R was also incredibly useful to learn and has served me well in industry.
This course is a disorganized mess! The assignments are vague, but the grading is specific. In the previous semester, Dr. Lebanon was very involved and helped students understand the assignment intent. This semester, Dr. Chakraborty was MIA most of the semester. The grading was inconsistent, the TAs were frequently combative, and the instructor’s solutions to problems were frequently terrible.
Our final exam appears to have been written at the last minute. Out of 35 questions, 3 were on material not covered in the course, 3 had issues that caused the instructor to admit they were bad, and 2 had multiple correct answers. To make up for these questions, the instructor added 3 points to everyone’s score and then refused to take any grade challenges. This solution is unfair and took many students from an A to a B gradewise. I was fortunate to not be one of them, but this class is the worst run class I have experienced out of 9 OMS classes - I am sorry I took it!
The on-campus course this spring looks like an amazing course, dealing with lots of big data technologies. Instead, we got a basic introduction to R, which I hope to never see again - it’s that poor a language choice. This class really should be taught in Python and should cover more relevant topics like the on campus course. I doubt Dr. Chakraborty will be back, but unless the lectures/topics are reviewed, I cannot recommend that anyone take this course.
Worst course ever… And I mean it. I just finished the whole master’s program and this IS the worst class. Fall 2016 was the first time the class was put online. The course contents were actually pretty good though quite light. The assignments and projects were also OK. However, the final exam and the overall grading were terrible. There were a dozen TAs, but no grading criteria or rubrics, while all assignments and final exams were essay-like reports… This created a total disaster. Maybe because the grading load was too much, TAs did not seem to spend much time on (or even care about) our tens of pages of reports, and the grading was generous but quite random, with no meaningful feedback given. In Fall 2016 they put some example assignments on Piazza and after the exam, some students voluntarily shared their exams, but similar answers were given very different scores. One of the answers that got a full score was almost identical to mine, but mine got 3/11. After I pointed out that my answer was similar to others but got a significantly lower score, the professor simply said that happens, but since mine was not perfect, the original grading was valid. So in the end I was stuck at 79.7 with a C… Additionally, after the final exam, many students found out that they were not able to see a big chunk of the final exam: a third of the exam didn’t show up on proctor track… I still don’t know how they dealt with these angry students. I filed a grading grievance and started to take the same class again, because I needed it to graduate. As of now, I haven’t heard anything about the grievance, and the semester has already ended. Spring 2017 was even worse: the instructor disappeared for most of the semester, and we only got 20% of the grading done two weeks before the final. There were several angry posts on Piazza, and some students filed grievances. For this semester, I simply resubmitted most of my assignments with a little editing. And I got higher than 100. Ridiculous.
My score is higher than 100 in this course, but I will NOT recommend it. Here are the reasons:
1) The instructor is draggy and not prepared. Can you imagine the instructor still preparing the final exam when the exam window is already open? 2) It takes the TAs an extremely long time to respond to your requests. I once sent a private message to my TA, and he responded after two to three weeks. 3) The grading is very slow. Some of the projects are connected. You don’t know whether your solution for the previous project is correct while the second one is already due. 4) I’m not sure whether the instructor and TAs communicate, because sometimes their responses to student questions contradict each other.
In terms of grading, the instructor is lenient. He finds every opportunity to give out extra credit. I learned little from this course. All I did was collect points.
As a conclusion, if you want to get a high score, go for it; if you want to learn something from the course, avoid it before the instructor is ready.
This is the class from hell for all of the wrong reasons. This included an absent instructor, a sarcastic TA, vague/contradicting assignment instructions, and extremely late grade returns. I (wisely) withdrew from the course the day before the deadline because I could not handle the stress. I regretted my decision at first, but the class did not get any better. Some say it got worse. I know this because I kept up with it on Slack and Piazza. All this being said, I feel like I would have gotten an A or a B in the course had I kept with it, but it would have put too much of a strain on my mental state.
The content itself is genuinely useful and interesting. I have little or no complaints as far as that goes. Several of the TAs did seem to go above and beyond the call of duty to try to salvage the class. You will read a ton about the R language. If you have experience with python Pandas library, then it should be (somewhat) familiar.
My advice to potential students is to wait until they fix the class to take it unless you need it immediately to fulfill the ML specialization. My advice to GT is to cast this class back into the abyss.
EDIT: Also, the extremely positive review below is bullshit. The complaints about this class are legitimate. Many of the people “complaining” (including myself) have successfully completed many OMSCS courses without the kinds of issues experienced in this class.
As some others have said, this is one of the worst classes I have ever taken. The material is interesting, but the class is managed ridiculously poorly. I’ve done okay in the course so far, but communication is so incredibly bad I wouldn’t recommend this course to anyone. New professor this semester, so older reviews might not be super relevant anymore.
Fun and solid class, but very light in material. I watched all the lectures the first weekend, they were mostly review but nonetheless they did help hammer home some good fundamentals so they were worthwhile. Workload picked up in the second half of the class. I have taken 10 classes and this was one of the easiest.
Professor Lebanon stressed he wanted the class to bridge the gap between a purely academic class with one with applications more for the real world as well. As somebody who has served both as a professor at top research institutions and worked at respected companies in industry, he is qualified to design such a class.
I took the class in Fall and I see there are negative reviews from Spring. Not exactly sure what happened, because the lecture materials are the exact same and I understand the assignments were almost the exact same. Sounds like assignments not being returned in a timely fashion is a valid concern, but I imagine should be fixed with future iterations of the class. But some of the complaints about the material and the ambiguity of grading seem unwarranted, I don’t necessarily expect a class to have to explicitly lay out grading criteria ahead of time and they weren’t in the Spring and students did not complain. I am puzzled why reviews are THAT different between semesters. Maybe the class got a reputation as an easy intro class (I would say mostly true) and maybe this shifted expectations and student demographic of the class.
I got a lot out of this course, and I enjoyed it!
This course covers R and data visualization in the first half of the course, and then it goes into both theory and use of logistic and linear regression in the second half of the course. I think that the first half of the course is generally useful and not overly challenging. They do provide pointers to an entire book on visualization using the ggplot2 package, which if you read from cover to cover and play around with it could satisfy those who already have exposure to R and ggplot2. The second half of the course will probably be challenging if it is your first exposure to linear and logistic regression, and it will probably not be overly difficult if you have studied those topics before. I did not have that exposure before this class, and the rest of my review assumes that.
As for pacing, the class is relatively slow in the beginning. You are probably looking at more like 6-8 hours per week on average of required work. If you dive into additional references and optional assignments, you can spend more (and get a deeper understanding). The pace really picks up from the first project onward. As students in past semesters have noted, the projects will take up as much time as you have, and you will always feel like you could have done more. On average, you are probably looking at more like 12-15 hours per week at that point, and it may be bursty.
Now, as for the class logistics… I think most of the reviews for the semester have been at an overly negative extreme, and the one just below seems a bit too positive (hello fellow undergrad alum!). There were problems. They were not handled in the best manner. Piazza became out-of-control negative. Communication from the professor was often sparse. CIOS feedback should be given to avoid this in the future. I am proceeding under the assumption the professor will take CIOS feedback to heart. But if he doesn’t, I would avoid his classes in the future.
“What can one man do against such reckless hate?”
I hate to make this sound like an amazon review, but DON’T LISTEN TO THE OTHER REVIEWS. The people ridiculing this course are drama queens who take no self-direction and want everything handed on a silver platter. Most likely they’re industry professionals out of touch with how education has changed in the past 20 years or graduates of inferior schools. Honestly every point made by other reviews is complete bull**. The instructor/TAs are very helpful if you have questions and easily accessible via Piazza, Slack, and email. You can ask them anything related to homework, lectures, and even more in-depth questions related to the material. Every time they WILL help until you confirm your questions are answered. I just can’t fathom how people are saying they couldn’t get in touch with the TA’s when they are literally on slack every single day at set hours. Honestly, it seems that people who waited until the last minute to do their assignments couldn’t get the right answers.
Next, I didn’t see much ridicule about lectures or course content, which we can all agree was thorough and useful. Although I did wish we spent less time on R and include another project. The homework and projects were fair with more than enough time, but once again people waited until the last minute to ask questions and panicked when they didn’t receive a step-by-step solution to solving the problem… pathetic.
The only complaint I will make is that grading took way too long. However, if you really need feedback you can just ask the TA if your code is correct. Regardless, you should know if your code is working based on the questions, so the feedback is not useful except for stating why you lost a few points (maybe for not including enough graphs).
Don’t let other reviews scare you. I know I sound like a drop in the ocean, but being a GT alumni I know the difference between a good and bad course. This is definitely the former.
I think it’s all been said already, so I’ll merely agree with all the other awful reviews. Worst “prof” ever. He should certainly not be hired to teach in our program again.
Also, I’ve seen some students dismissing negative reviews on G+ several times before saying those are just the students who weren’t prepared or didn’t get good grades, etc. So I want to make clear, I’m getting an A. Doesn’t change the facts.
Some of the TAs were helpful in getting through the course despite the madness: thanks Ravish, Ryan and Adrian. This could be a great class, but not until they get a real prof.
As other people commented, there were major operational issues in this class and I ended up spending more hours than I hoped to make sure I interpreted the problems correctly. The majority of the frustration comes from the communication gap among instructors, TAs and students and from unclear grading criteria without clear rubrics/expectations. The content seems reasonable, though it could cover more visualization techniques in my opinion. Overall, it was disappointing that things did not work out well throughout the semester and I hope there is major improvement planned for the next offering.
If you want to learn R and the basics of linear and logistic regression techniques, it may even be worth taking.
It seems I enrolled in this course at an unfortunate time, as the new instructor this semester didn’t seem prepared to teach a MOOC (curious how different his on-campus performance was). Regardless, the lack of timely feedback and consistency of assignment clarification led to a lot of unnecessary confusion. I’m all for an appropriate level of confusion regarding course content, as that’s called “learning,” but this semester was a logistical mess. Props to Adrian Chang, our TA who went above and beyond in grading, Piazza responsiveness, and overall engagement. Tough to do for 500 students, especially when the prof was frequently absent or seemingly unreachable.
Regarding curriculum only (in hopes of logistical improvement for future course iterations): I don’t come from a math background, and could have brushed up on calculus before enrolling. I thought this course would be more focused on visualization principles. There was some ggplot experimentation with project 1, but it’s mostly about using basic ML models and R to analyze data. Glad to get a first taste of logistic & linear regression, but am pretty convinced I don’t want to use R or ggplot ever again.
This course is my first course in GT OMSCS. I picked it after reading so many good reviews about it, but it left me frustrated, wondering why I picked this one. The course material is pretty easy, but the way projects/assignments are handled is pathetic, not worthy of a recognized US university. Too much ambiguity and too much (needless) load make learning a second priority, e.g. questions about using loops, running them 5 times, 10 times, 500 times, and asking for the complexity. No doubt these are essentials, but DVA is not meant for that. And the communication is cryptic: asking the TAs or the Prof. a question over Piazza or Slack, you get very terse/generic feedback. I just couldn’t tell whether the TAs/Prof. wanted to remove ambiguity or add to it. Like any stupid student, I sometimes dared to ask a follow-up question, only to get no further reply. Some of my Piazza posts remained unanswered. If I pass this course, it will be because of my own work and the students who were active and prompt on Piazza. The newly hired Prof. and TAs couldn’t make this course interesting, nor could they make us realize the potential of the course. Project/assignment grading took forever; the course is about to finish, and we’re yet to get feedback for half of our work. With all due respect, I really doubt their credentials to carry on. GT, please review this staff. Despite students asking for ideal answers, we never got any positive feedback on it. Grading is generous at times, but rather than that, I’d prefer to know what went wrong and how I can improve it. Oh, and how can we forget TA Jiachen Yang’s epic sarcasm. I think this guy should be given the privilege of writing the prerequisite section of DVA and every other subject in OMSCS for the next semesters, Fall 2017/Summer 2017. ** Won’t recommend this course until GT reviews this **
Similar to how others feel this semester, I feel like this class didn’t go very well.
Inconsistent communication between the TAs and instructor when clarifying assignments, slow grading, much less responsive on Piazza than the other class I’ve taken, small mistakes in the assignment instructions…
I felt like the video material was OK (certainly better than the rest of the class), but there were still a lot of confusing parts, like quizzes that weren’t sufficiently explained or that attempted to check free-form answers automatically.
Overall I’m pretty disappointed in this class and I think it wasn’t worth taking compared to other classes. That being said, I do find that some of what I learned, especially about visualization is quite useful and possibly relevant to my job.
I’m writing this one week before the semester is over. If you are registered or waitlisted for this course PLEASE read the Spring 2017 reviews!
This class has been the worst I’ve taken in this program. I’ve rated this class as Easy because the contents are basic and the assignments are not complicated. This course should be seen as “Learn R by example”. However, having a large class created dozens of different interpretations to the homework instructions that eventually led to great degrees of confusion. I have a perfect score in all returned grades so far and it has been because I’ve stayed away from Piazza and office hours. I just read and re-read the assignment instructions always thinking what else I would want to see if I was grading this assignment. I know that some will say it is not the student’s role but the methodology of going above and beyond has worked well in this program.
One of the main issues this class had was the amount of contradicting information given many times. On several occasions it was mentioned to us that the TAs had a hard time reaching the professor when asked to clarify.
The staff in this class needs to be evaluated. It was evident the TAs don’t talk to each other and a few do not know how to handle students’ queries. One in particular tried to seem tough in front of the class justifying their answers by telling everyone that this is GT (I agree with this argument). However the tone and level of sarcasm used in (their) posts came out as confrontational. You are an instructor and if you are part of the team that answers the forums you need to work on your communicational skills! If you are frustrated, take a break and answer a post in a couple of hours or take a walk before answering!
As a side note, there is a review here that claims the professor was blaming the TAs for “dropping the ball”. That is not true; it was a TA that said that.
TL;DR: Wait for this class to be reviewed by GT!
I would rate this course zero stars if possible. I have a master’s and a PhD in EE and have completed thirty-plus online courses on Coursera and edX. This is my sixth OMSCS course and by far one of the worst three courses I have ever taken in my entire life. I would quit the program if this were my first course. The content is very simple: some quirky R language, logistic/linear regression, and regularization for feature selection. The near nonexistence of course organization and the lack of pedagogy make the course very difficult. Basically, the professor plays the videos recorded by another professor and the TAs do the grading without interacting with students. I requested clarification on Piazza via a private post. A TA replied after 14 days and 3 reminders. The professor lacks basic pedagogic skills. There are a lot of ambiguities in the project spec, e.g., phrases like “describe as precise as possible” are used in the project spec for HW2 instead of explicitly listing all the items that are required. Another example is that he asks the students to describe patterns in a plot and later we are penalized for not explaining the plausible reasons for the patterns according to the grading rubric. It is funny that the grading rubric was released two or three weeks after the homework was submitted. The project spec is so ambiguous that there are literally hundreds of posts asking for clarifications, and contradictory information is posted by the TAs and the professor. There is one TA who responded to students’ posts with sarcasm and in a confrontational tone. I believe this course is bad enough to raise attention from the school. Andrew Ng’s Machine Learning course on Coursera is a better alternative for the basics.
Update: Constant delays in releasing grades, and even the final exam was released a few hours late. The last two extra credit assignments were obviously rushed out and questionably violated school rules. Received an “A”, but this does not change the facts.
I learned most of the content by myself. I didn’t see participation from the prof and TAs to make this class interesting. The semester is almost over but my initial assignment grades are still not out. The class is not very well organized, which makes it tough. I recommend not taking this class until it is fixed. This is the worst class among all the courses that I have taken so far.
Worst out of the 9 other classes I’ve taken. Don’t take this class until it’s fixed. It’s all in R anyway. So that’s a skill no one needs (Python exists). I guess if you really want to know how to make a graph using R while deciphering the worst instructions ever created for any assignment in the OMSCS program this might be the course for you.
Prof rarely participates. To quote Arpan Chakraborty (the Prof): “The TA’s have dropped the ball.” What kind of Professor blames the TAs…
New professor this semester. Material is not so hard, but the requirements are vague and ambiguous. You lose points if you did not successfully interpret what they really want(which is the most difficult part of this course for me). Different TAs reply the same question with different answers. I guess they don’t even align before answering questions. Some of them are arrogant. I still remember a TA answered students’ clarification request on a homework requirement like ‘there is no ambiguity here’ even though more than 10 students asked the same question. updates: so the professor showed up one week before the final on Piazza. I also heard some students filed official ‘academy grievance ‘ and it seems administrators are on students’ side. You know who is wrong now. I got an A, btw, which ensures this review is not a ‘grievance’.
Most of my time in this course is finding out how to make R do what I want it to. Grades are slow to be released, this semester’s batch of TAs had a few that “dropped the ball”. We have a second project due in two weeks, but have not received grades for the first. The assignments seem straightforward, but the grading can be harsh if not interpreted exactly like the TA expected. Verdict at this point is to skip this course until they tighten up some of the assignment specifications, and perhaps add more lecture content.
I feel like Georgia Tech and Udacity pulled a fast one on this class. GaTech couldn’t find a professor to teach, so they made Arpan Chakraborty, a Udacity course developer/instructor, an adjunct lecturer to teach under the GaTech brand. In his own words, “I apologize for being absent for the past couple of weeks; I’ve been busy with work outside of this class”. Excuse me? He clearly does not take teaching or this job seriously, nor is he concerned about the success of the students. I have taken 4 classes and loved them all, but not this one. Arpan and a few TAs (BTW, apparently we are all unworthy to TA Jiachen Yang) are clearly inexperienced and ill prepared. Awful communication skills, no rubric in assignments; students repeatedly complain about having to read the TAs’ minds, but the staff have so far provided no resolution to the issue. There are extra credit assignments that are worth 1 point each in the final grade, but it’s all 1 point or nothing, providing little or no incentive to complete the extra credits since they are poorly defined. TAs are poorly coordinated among themselves and routinely release contradicting information, which snowballs into a sea of Piazza threads. No peer feedback, no exemplary work. Project 1 was due Mar 6, 2017, and still hasn’t been graded as of April 4, 2017 because some TAs dropped the ball. Not looking forward to the final exam, which accounts for 30% of the grade. One big reason I joined OMSCS was to avoid this kind of teaching. GaTech needs to evaluate the staff carefully to uphold and preserve the integrity of this program. I suggest the staff take a page from the playbook of other courses like ML4T or CP or KBAI for future versions of this course.
I am adding this in the middle of the term so that it is useful for students registering in Fall 2017. So far, we have completed 3 homeworks and one project. One more project and the final exam remain. The course starts out easy: the first homework is very easy [introduction to R concepts and programming], the second homework is moderate [introduction to ggplot2 and data visualization techniques]. The first project was analyzing a movies dataset. There were ten questions with a lot of subdivisions, and each was equivalent to one homework. The third homework is implementing the logistic regression algorithm from scratch and optimizing the code so that it runs fast enough to train and test on the given dataset. I am new to the ML domain, and the lecture covered in the video was not at all enough to complete the homework; I had to watch the relevant portions of Andrew Ng’s course to gain the intuition behind logistic regression. It was hard work, but at the end I felt good implementing the algorithm and training it. This course starts off easy, and by the time you realize that it is taking all the free time you have, it is well past the drop deadline. I would strongly advise students with no experience in the ML domain to take this course as a single course. If you want to double up, it can be paired only with courses which are front-loaded, like the AI4R course. I have put in 18 hours per week, but that applies only to the second half of the course.
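For readers wondering what "logistic regression from scratch" involves, here is a minimal sketch. It is not the course's assignment or reference solution: the class used R, whereas this is plain Python, and the function names, the toy data, the learning rate, and the epoch count are all assumptions made up for illustration. It fits the model with batch gradient descent on the average log-loss.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss.

    X is a list of feature lists, y a list of 0/1 labels.
    The hyperparameter defaults are illustrative assumptions.
    """
    n_samples = len(X)
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # derivative of log-loss w.r.t. score
            for j in range(n_features):
                grad_w[j] += error * features[j]
            grad_b += error
        # Step against the averaged gradient.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, features):
    """Return the predicted class (0 or 1) for one example."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical, linearly separable toy data with a single feature.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
predictions = [predict(weights, bias, row) for row in X]
print(predictions)
```

The inner loops over examples and features are the part that "optimizing the code so that it runs fast" would typically target, by replacing them with matrix operations in a vectorized library.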
Looking at the materials I thought this would be an easy introductory course. Though the content is introductory, the poor organization of the course makes it too difficult. There is no specific rubric for grading the homework and projects. You will end up losing marks for unnecessary reasons that were not pre-specified.
I wouldn’t recommend this course as a first course. It would give you a bad impression of the program.
The course itself is not hard, but it is very hard to get good grades. There is no rubric or grading policy at all, and most of the assignment questions are ambiguous. TAs are very lazy and don’t answer questions, or answer in a very misleading way. Then at the grading stage, they will surprise you by interpreting the question differently from how it was asked, and deduct points from the students. A few classic examples: 1. What is the growth rate of the file size for different formats? (They graded with big-O notation.) 2. Describe your observation (grading notes released three weeks after submission said you can’t say A is bigger than B; you have to say A is 4% and B is 3%, so A is bigger than B). The answer is in the figure the students submitted, but the reason most of us didn’t put it in the final conclusion is that the assignment didn’t ask for it in the beginning. Definitely the worst course I have seen in OMSCS.
I was very disappointed in this course. It felt very rough - there’s little lecture and many assignments required considerable time searching the web to figure out things you needed that weren’t in the class material. Students found the assignments ambiguous, and piazza questions were answered very slowly and not in a clear manner. The original professor, Dr. Lebanon, was swapped out for Dr. Chakraborty and I just felt he wasn’t really prepared for the class. Assignment time started very short, but then ramped up to be very large. I found the 10 questions on project 1 were taking 9-10 hours each, which got to be a real problem. It felt like the instructors had not done the assignments themselves and didn’t have a good understanding of their sizing, plus ambiguity and uncertainty in grading (only 1 HW was returned by week 8) made you tend to overdo everything. It’s unfortunate because I like the topic and Dr. Chakraborty did take steps to restructure assignments, but the overall theme was “too little, too late”. Also the syllabus is strange - the second half is all basic machine learning, and didn’t feel like it belonged, while there was no treatment of visual esthetics - what makes a visualization look good vs. bad, how to style it best, etc. I would wait until this class is much more mature.
(To be updated hopefully once the term ends) The class this semester started on the wrong foot, and the general experience had been sour since then. It is a shame because the instructor is really keen on making the experience very educational and fun. He has prepared non-required activities where you can probably learn more without the pressure of trying to get a good grade.
What I think went wrong thus far:
- Unpopular decisions made by the instructor team. Personally I do not mind that much, but it is very reasonable for most people not to like them. And with a class size so big, the bad effects of those small “mistakes” are magnified by orders of magnitude.
- It seems the grading rubrics are not tightly coupled to the learning goals of the homeworks (or at least what I thought the goals were). The most popular “mistake” was giving huge deductions for missing complexity analysis in an HW where the goal is to get acquainted with R. Or deductions for not giving actual numbers when calculating those values is just incidental to the HW and the main goal was to visualize those numbers (this one is very minor).
I really hope things turn for the better soon.
The topics in the course were by the way not what I thought they were. I was thinking more on statistics: hypothesis testing, trend analysis (I think this one will be covered), knowing whether the samples are sufficient, blah blah. Stuff I would need if I want to do research and decide to do surveys, A/B tests, whatever. The syllabus was so short I thought there are a lot more in betweens. Not really, that is it. In the HWs, we were usually asked to give an analysis of results. But I don’t really know how to give those analysis correctly, so I end up giving very vague analyses (weakly related, higher, lower). I am yet to see my grade for HW2. The Instructor Team has an aversion for vagueness, so I am not very excited about my grade.
This is my 6th course in the program. By far the worst course I have taken so far in my entire life. I am hoping that they will make some changes. It is utterly disorganized and chaotic. The TAs’ and instructor’s participation on Piazza is minimal. Just do not take it until they fix the course. I guess Dr. Lebanon was great. We just got unlucky that they changed the instructor.
This is what I’d call a “cake” course. You can coast with minimal effort. There are 6 hours of lecture - you can go through all of it over a week. Sure, you learn a bit about R, and a few data visualization techniques, but when it comes to machine learning, you learn the most basic, rudimentary stuff that’s taught in any ML course - logistic and linear regression. Seriously - take Andrew Ng’s course and you’ll know more. If you just want a course where you have to spend the least amount of effort, learn very little, maybe pair it with a very hard course, then this course is for you. If you want even a small amount of challenge, avoid this course like the plague!
I’d strongly recommend not taking this course in your first term. This is a new class with plenty of ambiguity and disorganization. In case you haven’t noticed, there is a lot of overoptimism around this program, which is a good thing most of the time, but let’s be brutally honest: this class is not any better than many of the ML introduction courses that you can find online. If you already know something about ML, then this class will be easy; if not, you are going to be constantly looking for more references, such as the epic Machine Learning course by Andrew Ng or the Data Science specialization on Coursera. The final exam was a disaster; we got only partial details of it until days before taking it. They told us that there were only 3 questions, but actually each question had 3 equally complex subquestions. There were topics on it that we didn’t see during the course and that weren’t listed in the prerequisites. How were we supposed to answer those questions? We’ll never know, because we never received feedback on the final. Apparently for some students this is OK, as it’s been like that in other classes, but come on! It is not OK! We need feedback to learn and do better; if not, it is a waste of time.
This course was not well organized.
I loved the inaugural professor (Lebanon) as he had a very practical approach without neglecting the theory, either. The general workload wasn’t bad, but the projects could definitely suck you into trying to work indefinitely to improve results (self-imposed). All programming was exclusively using the “R” language… I was happy to pick up a new tool out of the deal! The biggest concern I had, but I accepted during the first, registration, week of class was the homework/project/final grades were all equally weighted at 1/3 of the grade. This was a concern and a bit of stress toward the end of the semester, but I was a consenting adult/student like everyone else in the class, so I didn’t think it was unfair. I would say the course was neither unbearable nor easy. I earned an A and think those who participated did reasonably well.
The content is very useful and applicable, and the professor made it easy to follow. The course is based on R and there will be an introduction to R in the beginning, so no need to worry about the language. The grades are evenly distributed: 1/3 homeworks (4), 1/3 projects (2), 1/3 exam (1). The homework and projects are manageable and relevant to the course content. However, the final exam is brutal even though it is open book. Be prepared.
This course builds a very solid foundation with R and basics of data science. Though it does not delve much into many ML algorithms, it focuses well on visualizations and key regression algorithms. Since it was a first offering, the structure of the course was bit vague and exam a bit surprising. But that said, overall a good course that is very informative
This is a good course if you are looking for an intro to R and to ML. The first half of the course is really basic R and most people will do well. There are projects and homeworks, but I could not really tell the difference between the two. The course had 3 HWs (33.5%), 2 projects (33.5%) and one final exam (33.5%). So the exam is a pretty big portion of the grade and is fully open-ended. It tests your understanding of what is taught in the course. Overall, I would say this is a good course, but not an easy A if you are new to R or ML.
For those who are looking to take another course along with this one, it can be done. I took this one (DVA) and 6400 (DB-Systems-Concepts) and could manage the workload on both.
This was a great course! It focused on the practical side of data science, with very little theory. Our semester had 3 homework assignments, 2 projects, and a final. The lecture material is light (only 6 hours). The homeworks were easy for me, usually requiring only an afternoon. The projects were based on a movies database. They were both really fun, but can be super time consuming depending on the quality of the report you feel comfortable submitting. I probably went overboard on mine, but mostly because I was enjoying myself. Full marks on the project are likely achievable with about 10-15 hours’ worth of effort (and we were given 4 weeks). I felt the final was very fair and interesting. Several students had issues with it, but those appeared to be more related to how proctor-track displayed (or hid) the exam questions, rather than with the content itself. The grading tended to be lenient this semester, but that may change with the new instructor. I’m sorry to hear Dr. Lebanon won’t be teaching next time. He did a fantastic job. The teacher and the TAs were very active and responsive in the forums.
Interesting class which teaches students about the real world and not just book knowledge. That being said, the class would be fairly easy for someone who already knows the R programming language or has already taken a machine learning class. I personally learned the concepts in this class, so I had to spend a lot of extra hours to get used to them. The professor is very knowledgeable and was very active on Piazza as well, but I felt I learned mainly from other students who were already in this field or were fairly knowledgeable and contributed their knowledge on Piazza.
This class is great! Professor Guy is very knowledgeable and taught very well, covering both the practical use of and the theory behind regression, feature engineering, and feature reduction. This course is really useful for daily tasks or projects in data science work.
Hmm, interesting course overall - in the end, I got the A I wanted, but my brutally honest assessment of myself is that I didn’t earn that A. In terms of the R programming side of the course, I use R on a daily basis at work, so the assignments were a breeze. The projects were cool and make the student think and formulate informed hypotheses about the data. Based on my own grades, I felt like some parts of the project (the second project in particular) were graded too leniently (i.e., in some parts where I felt I put “adequate” answers, but nowhere near the best answer or result possible, I still got full marks).
The final was baffling (in terms of the grade I got on it). My honest assessment of my performance was that I probably earned in the 65-75 range, but I was shocked to see that I scored in the 90s. It was very open-ended and I made sure to answer every part of every question, but I do feel like I missed the bar on some of the questions in which I received perfect marks.
These are good problems to have, I guess (rather than the other way around).
Overall, nice class, and it’s unfortunate that Prof. Lebanon (who produced the very nice lectures) will not be continuing as the instructor of record. Given that, it remains to be seen how generously this course will be graded in the future.
This is a decent course. Though beware of the comments here, because Prof. Guy Lebanon will not be teaching in Spring 2017. With that in mind, I found this course to be a tad too easy. If you have a background in vector calculus and R, the homework will be straightforward, with basic data munging and proofs.
The final exam was… strange and the projects were sufficiently challenging. It will be curious what future iterations look like.
Course mainly covers R, tools for data visualization in R, Logistic Regression / Linear Regression, and regularization.
The actual content in the course can be considered light however, the videos and the supplemental readings are so well done I wish every other course could use this course as a template. They are really produced in such a way that is not too math intensive but uses enough to give enough depth in the concepts reviewed.
The professor and TAs are really involved. They definitely seemed to want to help and resolve all student issues.
The only thing I disliked about the course was the final. The professor did not prepare us for the final at all. The content on the final was related to what we had learned in the course up to that point, but was applied in a completely different way than it was in the homework and project. Also, the final was 33% of the grade (way too high given the type of test we were given). I would have preferred a midterm plus a final making up the grade instead of a single final.
Everyone should take DVA before Machine Learning to reduce the initial learning curve of ML. It’s a great introduction to R and data analysis. There’s some math involved as well, although not too complicated. The workload is moderate: light in the beginning, but picking up during the later homework and the projects. The class is a good combination of practical and theoretical. I ended up actually understanding regression and the math behind it and became quite proficient in R. It’s such a bummer that the professor will not teach next semester. He is very involved with the class.
Really great course. The TAs were active, helpful, and obviously worked hard. I felt bad when a student essentially posted that the TAs weren’t worthy to grade his work without PhDs. Many other students appreciate your grading!
The first half was a lot of visualization and getting started with R, which was nice for me because I have never used it. The latter half was similar to ML class, not quite a subset but very similar topics. I’m sure I didn’t do great on the final, but I won’t let that change my review on the class overall.
It is a good introductory course to learn R and linear/logistic regression, however there isn’t much guidance so you will be learning on your own if you don’t have prior experience. The professor and class are very organized and there isn’t too much work, but the class had a lot more potential. The instructor works at Netflix and has great experience, but IMO didn’t teach much from his real-world experience. I didn’t learn any neat techniques that I could apply in a professional position or show-off in an interview, just the basics that I could teach myself from random online resources.
This course is deceptively easy. It starts out really slow, learning R and plotting. Then as you get into the algorithms it picks up in speed, so don’t let the start fool you. The first project is relatively easy since it is mostly just figuring out R libraries. Part 2 takes longer if you really try to apply everything, including things that may not have been covered yet. Make sure you plan accordingly and don’t wait until the last moment for things. That is the only time you will really run into trouble.
Overall, I thought this would give more industry experience given that Prof Lebanon has worked at a variety of large companies. I think I learned more from other students who were nice enough to post explanation than I did from the professor. I would also recommend getting The Elements of Statistical Learning as an alternate reference for this course since there is little reference material provided. Also, brush up on linear algebra and calculus before you get to logistic regression.
This is an excellent foundation course for anyone in general, but for anyone considering ML, this will give you a good course to start with. The course is very well structured, and the prof. and TAs were very active. If you want to prepare for this course, I would suggest you complete the Exploratory Data Analysis in R course on Udacity to get a heads up with R. This way you can start with the more interesting topics which start after the mid semester. For math, I found most resources I need from just Khan Academy.
Really enjoying this class as my first foray into machine learning.
- Workload is dependent on a few things: your prior experience in ML, and your willingness to apply yourself. I myself had no prior experience and really wanted to apply myself so that I can properly internalize the core concepts and found myself working around 20 hours a week. However someone with experience in ML and ML4T can get by with no more than 10 hours a week of effort.
- Math is a tough thing to get up to speed on but we have had some really engaging piazza discussions that solidified my understanding and the students really take it upon themselves to make sure their fellow classmates are in a position to succeed. It’s great.
- Instructor is very responsive and I have yet to see a relevant question unanswered by him. He’s got some great experience in ML and does a good job of breaking down the concepts into digestible chunks.
Final takeaway is to definitely take this class if you are in the ML track but if you are already quite proficient in ML and R, this class may not give you anything more.
This is a great class, and how classes should be structured. I agree with most of the other comments mentioned: workload is do-able (plan at least 12-15 hours a week, as the projects take far longer than you will anticipate), there is roughly 9 hours of video total, and the instructor is very active and very reasonable.
The only thing I will say differently is the math, at least for the last homework, was challenging for someone rusty in partial derivatives. Fortunately this part of the homework is the only homework / project that you actively need a math skillset; fortunately the instructor is very lenient on discussing the math section, to the point where a legendary student (who is also a math professor) walked the class through most of the problem and there was no complaint from the instructor. I will say brush up on Calculus (particularly derivatives, partial derivatives, and the chain rule), Probability, and basic Linear Algebra and you should be fine.
This is an excellent class. I have learned a lot about R that helps me write predictive analytics code effectively. The tutorials are so exceptionally well done and applicable that they can mislead you into thinking the class is simple. Excellent instruction that makes the leap into application reasonable does not mean it is easy. This is how courses should be structured. The projects and homework are challenging, not simple. They apply what you are being taught while requiring a stretch for you to apply the concepts. I thought this was full of a lot of practical and meaningful work, and I highly recommend this class to anyone. You’ll really love the assignments. I did. Very interesting to do. Also, just so you know, how people are saying it takes 3 hours is beyond me. Not if you are including the pacing of projects and homework in your assessment. I read a comment about 30 hours per project over 2-3 weeks… and agree. Plan well: 30 hours over the 2-3 weeks mentioned is about 10-15 hours a week.
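As a quick check on that pacing estimate, the arithmetic can be sketched as below. The 30-hour total and the 2-3 week window come from the reviews; the function name is made up for illustration.

```python
def weekly_hours(total_hours, weeks):
    """Average hours per week if the work is spread evenly."""
    return total_hours / weeks


project_hours = 30  # per-project estimate quoted in the reviews

low = weekly_hours(project_hours, 3)   # spread over three weeks
high = weekly_hours(project_hours, 2)  # compressed into two weeks
print(f"{low:.0f} to {high:.0f} hours per week")
```

Spreading the work over the full three weeks gives the low end of the range, and squeezing it into two gives the high end.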
The workload in this class was relatively light. There’s only six lectures, very similar to the style of AI4R. The course is very heavy on R programming, and at least some form of statistical background will help you a lot in this class. Coming from a SAS-heavy background with no knowledge of R, I’ve appreciated the opportunity to get my hands into data manipulation and visualization using R. We also had the opportunity to learn the mechanics behind linear regression, logistic regression, and regularization.
There are 4 homeworks (technically 3, but the 3rd counts as double and is twice as long), all of which have been doable in a reasonable amount of time. The project this semester is divided into two parts, and is about analyzing and visualizing movie characteristics from IMDB’s database to find the best predictors for gross revenue. I thought it was pretty neat to be given a real world problem with imperfect data to analyze. Don’t procrastinate on the project. Unless you’re really handy with R, it’ll probably take 30-50 hours for each part (divided into two parts, due at different times).
I really enjoy watching the lectures (sped up to 1.5x), and the professor provides the lecture slides so you can actually just sit, comprehend, and follow the notes rather than screenshotting the video every two minutes. The professor is very active on Piazza, which makes up for the extreme lack of TA participation.
Preliminary review, so that people choosing a course for next term might have some information at hand. The instructor is very active on Piazza (not really sure if it’s just because this is the first offering). Office hours are held bi-weekly. Video lectures have been great so far, although there are only 6 hours of video for the whole semester (with a lot of in-line quizzes, which may increase the amount of time dedicated to the lectures).
The course is completely R based, but don’t be afraid, you learn it from scratch. I’d say this is a good course to pair with another one, it has a medium load. Also I think this would be a very nice introduction to ML topics. Basic calculus/probability concepts are needed for the second part of the course.
Homework and projects feature open ended questions, where you explain/justify your answers (report) and also provide the R code you wrote to justify your analysis (tables, plots, etc). It’s also helpful if you already have experience writing “vectorized code” (like in Matlab, Python with Numpy/Pandas, etc).
Syllabus:
- R programming (most of the course will be in R)
- data visualization
- data processing
- logistic regression
- linear regression
- high dimensions and regularization
Each one of those “chapters” is covered in two weeks, so typically in that lapse of time you watch the lectures and work on the homework/project associated with that chapter.
Grading: The course features three assignments, an individual project (divided into a midterm project and a final project), and a final (open book, not multiple choice) exam, with the grade distributed equally between these components (33% all hw, 33% project, 33% exam).
Final review: The class only has 6 parts to it: R programming, Data Viz, Preprocessing Data, Logistic Regression, Linear Regression, Regularization. Through the second part, the class continues to be heavy on R syntax and light on actual concepts. The part on Logistic Regression becomes a bit more concept oriented (and less R oriented). In general the workload is still very light, except the projects took an average of over 30 hours to complete (but there were 2-3 weeks to do each). So make sure you start work early. The class features interviews with Netflix data scientists, which are interesting to watch. The professor is active on Piazza. Good experience with YouTube Live office hours. The class organization could be better; TAs don’t have much of a presence on Piazza. I think the class could have taught a lot more; while it’s a good intro to data science, the amount of content is a bit underwhelming. 12 lessons would have been better and still manageable.
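To see what an equal three-way split means for a final grade, here is a small sketch. The equal weights are taken from the review; the component names and the sample scores are hypothetical, and the real course may round or curve differently.

```python
# Equal weighting as described in the review: one third per component.
WEIGHTS = {"homework": 1 / 3, "project": 1 / 3, "final_exam": 1 / 3}


def course_grade(scores):
    """Weighted average of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores: strong homework and project, weak final exam.
example_scores = {"homework": 95.0, "project": 90.0, "final_exam": 70.0}
overall = course_grade(example_scores)
print(round(overall, 1))
```

With a single exam carrying a full third of the weight, one weak exam moves the overall grade as much as all the homework combined, which is why several reviewers found that weighting stressful.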