DBM/502 DBM502 DBM 502 WEEK 1 ASSIGNMENT 1
- University of Phoenix / DBM 502
- 04 Oct 2017
- Price: $10
- Other / Other
DBM 502 WEEK 1 ASSIGNMENT 1
Assignment 1
Individual: Database Fundamentals Paper
Assignment Preparation: You have been selected to run a major project for management of data at one of a number of organizations, with the organizations and their projects described in separate attachments. (Individual Assignment A, Individual Assignment B, Individual Assignment C)
Note. Your course facilitator may select certain choices or create his or her own. Therefore, do not assume that the organizations and projects available in one section of this course are the same in all sections.
You may create your own organization based on an organization and project with which you are familiar, but if you do so, you must provide materials covering each of the following areas described for the provided examples:
- Description of the organization
- Description of information the organization needs to track to support its daily business, and how it will be used by the organization
- Description of the most common or urgent activities that will access this information, how often they occur, and the expected response time
- Description of the organization's needs for investigating trends in this information over time (optionally, include information collected in other existing databases)
- Description of other data sources available to the organization, internally or externally, including some sources of data that are unstructured
Week 1 Scenario
You have a challenge in your project: you do not have a team, a process for development, or any tools for data management with which to do the job. You need to develop a plan for them.
Write a 1- to 2-page paper that addresses the following points:
- Define at least five key roles for members of your team involved in the definition, creation, and maintenance of the database and applications that use it.
- Give a job description of each role, with the skills and levels of education or certification you expect in each role, along with the major tasks that each team member will be expected to perform. Use salary information from referenced sources to suggest a range of possible salaries.
- Describe the major tasks that the team will perform, and the natural order of doing those tasks the first time.
- Explain key tools that team members must have to perform this job.
Submit your assignment using the Assignment Files tab.
Individual Project A
Introduction
This project is inspired by the description of real studies done by IBM for PGE and Honda (Mayer-Schonberger and Cukier, pp.102-3).
Organization Description
Major Electric Corp (MEC) and The Electric Car Company (TECC) have formed a joint venture, Estations Inc., to create stations in carefully chosen locations around the city where people will be able to both park their electric cars for the day while they go to the office, and recharge them. Estations will use off the shelf software for most common functions, but is creating a special organization that you will head for creating and managing applications and databases.
Information Needs
Estation has a growing number of locations around the city, each of which will appear to be a simple parking garage. Each space in these garages can be configured as either a quick charge space, where a car can be fully charged in about 30 minutes, or a standard location, where it can take 3-8 hours, depending on the capacity of the car’s battery.
Estation needs to track reserved spots for those drivers who need to be sure they can charge their cars.
During the period for which this system is planned, we hope to reach 400 stations, though if things go especially well, it might be 1000. The stations will vary in their configurations, depending on the available space, the density of population of people with electric cars in the area, etc. Each station will have multiple levels (usually at least four, with some major city locations having as many as 20), with multiple aisles on each level (often 2-3, but some really large locations might have 6). Slots are on each side of an aisle, and can be anywhere from about 10 to 50 slots per side. We hope that most of our locations will eventually be filled 70% of the time through the 8 AM to 6 PM period, and 30% of the time the rest of the day. Each location may have different hours, depending on what is profitable for that location.
In some locations, all or part of the top floor will be dedicated to solar panels to collect electricity. This has the obvious advantage of cutting our electricity usage, particularly during the hot, sunny days of the summer, when electricity costs are often the highest. At least as importantly, it is a marketing tool for our environmentally conscious potential customers, who will like the idea that their cars were being charged by solar power. This does have costs and risks. Dedicating space to solar panels means costs in the panels themselves, as well as in the opportunity cost of not having those spaces available. It may or may not actually produce much energy, if the area is often cloudy, or a neighboring building puts the top floor in shade for part of the day.
We will need to be able to provide an app (probably using GPS and Google Maps information) that will help customers find a location for a particular period of time, which is close to either where they are now, or where they plan to be. They will tell us when they plan to be there, and when they plan to leave. The app will show them locations near that spot (perhaps eventually taking into account transit options from the locations). It will show them capacity that is not yet reserved for some portion of that period that will be able to charge their vehicle in the time they plan to be there. There is also an option for the customer to say that they will accept spaces available that would only provide a partial charge (mostly when they won’t be there long, and don’t want to use the quick charge space). They can make a reservation, paying by credit card, or wait until they arrive and take their chances on getting a space.
We will need to track our electricity expenses, where the rates can vary from hour to hour, based on demand on the grid, and also the production of electricity from our panels. We will also need to track equipment failures, both for the routine purpose of not placing customer cars in spaces that don’t work, and to let us analyze our costs of repair, and failure rates of different models of equipment.
Common Transactions
Probably most of the activity against the database is the app activity:
1. Find a location near the customer.
2. View the floors with available space during the period the customer will be there, with the available spaces indicated with their type.
3. Select a space, and then reserve it and pay for it.
4. Cancel a reservation.
5. Make a continuing reservation (applied to credit card monthly)
There will also be
6. Monthly runs to bill continuing reservations.
7. Reports to see where we have especially large amounts of available space, for use in generating promotional emails to previous casual customers.
We expect about 30% of the spaces to be reserved on a continuous reservation during the day. The other 40% we anticipate filling, plus almost all off hours spaces used will be filled by people making requests of types 1, 2, and 3 each day. There will probably be at least 50% more requests than we actually get reservations. We hope cancellations will be no more than 10% of the total.
Historical Reporting
We will need to investigate a lot of different possibilities, not all of them predictable. These might include:
1. How profitable are the different locations over time (individual locations, locations in particular cities, locations of particular sizes, ones with and without solar panels)?
2. How much electricity are we getting from solar panels? How does this vary across time and location?
3. How do grid electrical rates vary over time? Currently, we do not charge customers varying amounts based on the varying rates, because of the complexity and public relations challenges. Are we losing money on some customers because of this?
4. How often are the locations at capacity for one kind of service or other?
5. How often are the quick charge spaces full? How does this vary by locations?
6. How often do cars remain in spaces of either type after they are fully charged?
7. Where did casual and continuous reservation customers hear of us?
Other Data Sources
We have access to information from government agencies on weather in the area, including indications of when it was sunny near each location or potential location. We have a long series of information on grid electricity rates in different locations over a few decades from our electrical company parent. We have data available from our car company parent on the amount of electricity need and maximum charging rates for the different models of cars (both the ones they manufacture and others). We have financial information about our construction costs, staffing costs etc. from existing off the shelf systems. We have access to data from the government on employment by category in different census tracts throughout the city, including statistical data on the typical salaries in each area. In some cities, we also have statistical information available on commuting patterns, indicating where in the region the people who work in a given part of the city typically live.
References
Mayer-Schonberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think (pp. 102-104). Boston: Houghton Mifflin Harcourt.
Individual Project B
Introduction
The project is fictional. Any resemblance to any real organization is purely coincidental.
Organization Description
Big Online University (BOU) has over 100,000 students, all of them attending online. It has many existing systems and databases used to track students, course registration, grades, finances, etc.
You have been brought in to develop a database for the office of assessment, as part of a program to help BOU evaluate the student success of its programs.
Information Needs
Every program at the university has learning objectives, which define broad areas of skill or knowledge that students should have upon graduation. There are additional objectives associated with the common curriculum that all students need to achieve.
A student may be in one or multiple programs (e.g., as a major or minor area, or a certificate program). Each program is comprised of both required and elective courses.
The office of assessment arranges for students’ learning to be measured either directly (through assignments in the curriculum itself, or through outside tests), or indirectly (through the evaluation of student learning by the students themselves, or in surveys of employers or faculty members).
The database must track student learning information, as well as a range of facts about items that might have influenced that learning, like particular courses or versions of courses that the students took, or the background of the students before they arrived (e.g., how did they perform in high school, are they working in the field that they are studying, how did they score on evaluations done at entry to the university?).
The database must also track actions that were taken to improve programs, and which courses were affected by them (and in which versions). This can be useful for understanding if the changes were effective.
Common Transactions
The database has far more activity in addition to data than in conventional access to data. Information about new students needs to be added as they are accepted (since this is a four year school, about 25% of the total students are in each class, and most of these are added in a large batch from the admissions database just prior to the beginning of the school year, when it is known which students will attend. The information for a small number of transfers, including records from other universities, must also be added.
There will also be a large quantity of new information from direct assessments, concentrated in the two weeks that end the two semesters. Roughly one class in five has some assessment data generated for each student, and the typical student takes 10 courses a year. The volume of information from indirect assessments is much smaller, with a sample of about 10,000 surveys (between current students, alumni, employers, and faculty each year, and these are spread throughout the year.
Updates are made to courses taken by students as they add, drop or complete them.
Each semester, the office produces statistical reports about the learning of all current students for each program.
Historical Reporting
Much of the use of assessment office data involves historical analyses with data from this database, often combined with other sources. The university wants to understand which versions of programs produced the highest learning success, whether programs are effective for students who currently work in the field, or for those who do not, whether there are certain indicators in the backgrounds of students that indicate whether they are likely to learn, or even graduate, etc. The university may also need to understand where in the curriculum students are learning (since some learning objectives may be partially covered in several different courses.)
Other Data Sources
The university has information about how students performed in particular courses, whether they graduated, how long it took to do so, etc. Indeed, since the courses are all online, the university can potentially track every action in every class (how much time did the student spend reading the texts, how often did the student participate in discussions, which problems did the students answer incorrectly on the automated math tests, etc.). The challenge is mostly to determine what is useful.
There is also outside information about student loan defaults, salaries that alumni obtain, what fields they may be working in (e.g., you can see this on LinkedIn). All of these may potentially be useful in understanding what kinds of students succeed in learning and applying their education.
Individual Project C
Introduction
This project ties data management into networking, though at a carrier level. The numbers are all made up (though they should be plausible, in most cases).
Organization Description
Smallton Rural Telephone Cooperative has received a grant to provide Fiber To the Home (FTTH) service to its 20,000 subscribers in Smallton and the surrounding rural county. To support this, SRTC plans to develop a database and associated applications to allow it to track this new network, and the services provided on it.
Information Needs
SRTC will need to track the network itself. The network is a Gigabit Passive Optical Network (GPON). At SRTC’s Central Office in Smallton, there will be multiple pieces of equipment to connect to video feeds from different television networks, and high speed routers connected by very high speed equipment to the neighboring major carrier to connect Smallton to the Internet.
These are connected into high speed dense wave division multiplexing equipment that can place 96 different optical signals, at 2.5 Gbps each, onto each fiber connecting to a neighborhood (in either direction). Once the fiber reaches the neighborhood, there is a passive device (requiring no electricity) directing three signals onto outgoing fibers, which reach connection points in front of each home. These are typically used for television (outgoing to the customer), downstream Internet (outgoing to the customer), and upstream Internet (incoming from the customer).
Connections are not made from the connection point to the home until the first time service is ordered. It costs the company several hundred dollars per home to make the connection, and since the grant only covers the more expensive fiber and equipment reaching the yard, the company does not want to make connections to the house until there is an order that will generate revenue.
The database will need to track all of the fibers, equipment, and optical signal, as well as the equipment connecting Smallton to video and the Internet. It is particularly important to track which homes have been connected (and, until the network is complete, which homes have connections all the way to their yards, and are thus available to have orders).
The database will also need to track orders, and where they are in the process of being delivered (a few simple electronic activities if there is a connection to the house, but digging trenches, running fiber, and installing the fiber optic connections at the house when it is first connected).
Customers will not initially even be offered Internet connections at the full theoretical capacity of their links, because the cost of the routers and high speed connection to the neighboring company would be very high at those speeds (if all of the 10,000 homes in the area were at full capacity, this would be 25 trillion bits per second). Part of the setup of each Internet customer is to tell the router the maximum speed to allow. Customers may purchase either video or Internet or both, and may purchase different levels of capacity for Internet in each direction. Only certain levels are allowed, which change, and need to be modifiable in the database.
Common Transactions
Early on, the transactions will be dominated by adding information about new equipment and connections. The goal is to connect to the yards of all SRTC area homes within two years.
After that, there will be a period where the transactions will mostly be additions of service at new homes. The hope is that this will grow to cover 50% of the homes in the area within 5 years.
In the longer run, this will probably settle down to a churn of about 10% of customers each year dropping service, and a different 10% adding service (either at previously connected homes or new ones). There will be a slow rate of new additions to the network, as new homes are built in the area. About 20% of homes are expected to change service levels and types each year (adding or dropping one of the services that they have, or changing the speed of the Internet connection).
Initially, about 5% of equipment will need to be replaced each year. Serial numbers and models need to be tracked so we can know where we have issues.
Extracts from the order data will be used for billing all customers each month. Also, the data is checked by a vendor supplied network management system whenever there is a problem identified (by the equipment or a customer call) with the connection to a home. Major outages (e.g., near the central office) might affect 1000 customers (and thus have 1000 reports, each of which ends up identifying 1000 affected homes).
Historical Reporting
Historical reporting will address penetration of service (how many homes can be connected, how many actually are, which currently have service, what services do they have and at what rates), considering how this has varied over time, or areas of the community.
It will also look at churn rates over time and areas.
Another important area is analysis of problems. We need to know how many customers are affected, what equipment types have the worst problems, whether there are seasonal issues, how long they take to repair, etc.
Other Data Sources
SRTC has access to data from the equipment vendors on failure rates. It also has data from its network management system on every report of a problem from different kinds of equipment (soon to include
information on the GPON network that this system will track. It also collects ongoing data on error rates and lost packets over the network, which some think may start to increase before the equipment fails. There is also geographic data and weather data from the government showing estimated rainfall, snowfall, temperature and winds in different parts of the SRTC area. There are also records of road construction activity throughout the area, as well as building permits which may (along with plowing by farmers) be associated with fiber cuts.