Aim of the Studio Project
The main aim of the Studio project is to supply an easy-to-use application that allows the recording and playback of digitised sound through a PC sound card on so-called IBM-compatible PCs running a UNIX-type operating system, rather than the standard DOS.
On the UNIX system in question, Linux to be specific, the driver software is already in place. However, for the novice and even the expert, having to use only the tools provided by the driver software is tedious. Thus Studio aims to take the digitisation features of the driver and give them a more accessible user interface using windows, icons, mice and pull-down menus (WIMP). The application is also to include the facility to edit the sample data.
Another aim which spawned from the specification of the project is to use only software that is freely available from the public domain. The finished project will then be placed back in the public domain for all interested parties to use and modify as they please.
This document aims to portray the effort that has been made in achieving these aims.
At the start of the project I had only just come to the realisation that there was another operating system other than DOS for the PC. Thus research into this project started at a very fundamental level.
The first step was to understand the fundamentals of the UNIX system and to become familiar with its basic commands. Once familiar with these, I started browsing the Internet and World-Wide Web for information and programs relating to my project. This was an important source of research material.
The following will relate some fundamental topics to the Studio project.
Linux and VoxWare.
Linux is a UNIX clone for 386/486-based PCs. It was written by Linus Torvalds with the assistance of many Internet hackers around the globe. It is a product which is steadily gaining support, since it is a fully compatible UNIX system, with multitasking, multi-user and networking capabilities, for perhaps the most dominant microprocessor in the world, and it is open licence software!
An example of the way Linux has been enhanced by global hackers is its sound card driver. The software which controls the PC's sound card (the driver) was written by Hannu Savolainen of Finland. He spent hours of his free time writing this software out of personal interest. In the "spirit of the Internet", he released it into the public domain, and after subsequent revisions and improvements it is now part of the distributed Linux system. The version of the driver, now known as VoxWare, that is used in Studio is 2.4.
The VoxWare driver sets up the sound card devices for use within Linux by creating up to four device files. These devices are:
1. /dev/mixer; The device in the sound card which allows mixing of the input lines.
2. /dev/dsp; The ADC and DAC or the sampling device of the sound card.
3. /dev/sequencer; The built-in synthesiser device.
4. /dev/midi; To control the external synthesisers that may be connected.
In the Studio project the second device is the most significant, since only sampled or digitised sound is required by the specification. The first device, the mixer, is also needed to ensure that the recording of samples is successful, since recording lines can be disabled at boot and then need to be switched on.
In order to play or record a sound sample file using these devices a UNIX redirection must be used as follows,
cat samplefile > /dev/dsp
The above command will play a sample from beginning to end. A portion of the sample could be played but only by complicated use of UNIX commands. Clearly this is tedious.
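To make the "portion of the sample" problem concrete, the following is a minimal sketch of selecting a time range from raw sample data by byte offset. It assumes headerless 8-bit mono data at 8000 Hz; the function name slice_sample and its parameters are hypothetical and are not part of VoxWare or Studio.

```python
def slice_sample(data: bytes, rate: int, start_s: float, end_s: float,
                 bytes_per_frame: int = 1) -> bytes:
    """Return the bytes of `data` lying between start_s and end_s seconds.

    Assumes raw, headerless sample data; bytes_per_frame is 1 for
    8-bit mono and 2 for 16-bit mono.
    """
    if start_s < 0 or end_s < start_s:
        raise ValueError("need 0 <= start_s <= end_s")
    first = int(start_s * rate) * bytes_per_frame
    last = int(end_s * rate) * bytes_per_frame
    return data[first:last]


if __name__ == "__main__":
    # One second of silence: 8000 frames of the 8-bit mid-scale value 128.
    one_second = bytes([128]) * 8000
    portion = slice_sample(one_second, rate=8000, start_s=0.25, end_s=0.75)
    print(len(portion))  # 4000 bytes, i.e. half a second
```

The returned bytes could then be written to the sound device in the same way as a whole file.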
VoxWare 2.4 has very limited real time facilities, due to problems with buffering the sample data flow and the complexities of a multitasking system. Real-time features are under development and experimental releases (version 2.9) are currently available.
Fortunately VoxWare also provides a header file, which provides the interface to the sound card devices from within a C program. The job of creating an application has been left to someone such as myself.
X-Windows and its toolkits.
Another important background topic is that of a windows environment. It would not be feasible to attempt writing a WIMP application, in the time given, from an operating system level, because of the level of programming sophistication involved in having to control the mouse and to process all the various user events. Thus making use of existent windows environments is mandatory.
The standard windows environment for a UNIX system is X-windows, which was developed by MIT. This environment is compatible with the Linux system and is included in many distributions of Linux. In other words it is also public domain software.
X is a complicated piece of software engineering. At the core of X is the server. The server provides applications with services which generate windows, text and graphics. These services are independent of the actual display hardware. It also handles the input hardware, such as the keyboard and mouse. The server communicates with applications through a network connection. This has the advantage of only a single server being required to run X on multiple machines. An application in X is often called a client. The client requests resources of the server, such as windows, bitmaps, and fonts. The server notifies the client of all the events appropriate to the application's requests.
The concept of a window is naturally central to the X-system. A window is basically a space set aside within X for some purpose. An application is generally made up of numerous windows; there is a window for virtually everything that is seen on the screen. These windows are grouped in a hierarchical order. The base window of an application is called the root or top-level window; all sub-windows are child windows.
figure 1 - Three Layer Model of X
Much of what a novice might consider fundamental to a windows system is actually not part of the X-server's function. X needs a window manager to run successfully. The window manager is actually considered an application or client of X. Its main purpose is to co-ordinate the positioning of root windows. It is the window manager that allows you to iconify and position windows on the screen. It is also responsible for creating the surrounding borders of applications, which, along with other subtle arrangements, create the general look and feel of the window environment. There are a number of window managers, thus giving the user a choice of preferred environment.
For the purpose of this project X-windows may simply be considered as a three layer model, as shown in figure 1.
The programmer is shielded from the complexity of X-windows by a library of predefined C functions called Xlib. However these Xlib functions are still quite low level, and still require some detailed knowledge of the X-window system. Fortunately higher levels of procedures and functions have been devised. These are generally called toolkits.
figure 2 - Different "Looks and Feels" (Xt, Xview, TCL/tk)
There are a number of toolkits. Each one is designed to make application programming easier by offering functions which simply set up window widgets, such as buttons, scroll bars and selection boxes. They also simplify the task by removing the decision about the look and feel of the widgets. Thus there is often a toolkit associated with a window manager: Xt intrinsics for the basic X window manager, Xview for Sun's openlook window manager, and Motif and TCL/Tk for the Motif window manager look and feel. Examples of these are given in figure 2. Of these, the Xview toolkit and TCL/Tk are both available in the public domain.
An important research area was to determine whether this project had been done already. This involved tracing the globe via the Internet. I will admit my Internet skills are weak, but so far as I could determine there is no similar application in the public domain. However there are some related applications.
There are a number of graphical interfaces for the mixer device of a sound card with various looks and feels. These include xmixer and xv_mixer.
The programs vrec and vplay are command-line interfaces to VoxWare. They have a number of features that make them superior to the basic commands of the sound driver interface.
The piece of software that increased the potential of the Studio project is Sox. Sox stands for SOund eXchange. Sox is a command-line driven utility, which has two main functions;
1. to convert a sample from one format [1] to another.
2. to provide some signal processing in the form of effects.
Incorporating this utility into Studio would enhance its usefulness and flexibility:
· flexible in allowing playback and recording in all the major formats, and thus not requiring its own format.
· useful in allowing the user to enhance the sample, by filtering or one of the other methods available.
A list of the effects and available formats is found in the User's Guide and the Sox manual page, provided in the Appendix.
Principles in User Interface design.
Designing and programming a graphical user interface (GUI) is quite a different concept to the sequential programming I have been used to. Thus some study had to go into learning how to create a useful human-computer interface (HCI).
User Interface Guidelines
The following are guidelines for creating a successful user interface [2].
· Consistency in appearance, messages, the way the mouse behaves etc. A lot of the consistency points are arrived at automatically by using a toolkit.
· Offer meaningful dialogue. Make sure that when communication between the User and Machine is necessary that it is clear and that the user can respond correctly.
· Confirm actions that perform "destructive" tasks. A destructive task might be one that will wipe out all the working data.
· Be forgiving to the user. Make sure that when a user makes a mistake that the system doesn't crash. Also there should be facility to undo or reverse previous actions.
· Ensure efficient mouse movement, and keyboard operations. This is to minimise the amount of effort a user has to put in to perform tasks.
· Provide on-line help facilities, to ensure that when the user is stuck, he can come unstuck as quickly as possible.
· Categorise activities by functions. A clear organisation of functions makes the interface more presentable and easier to learn.
To implement a graphical user interface in a window environment, a few new concepts had to be learned. A GUI program is very fragmented and consists of many small sequential programs. Each sequential program, or call-back, is attached to an event in the graphical user interface, such as a button press. Thus the first thing that must be done in a GUI program is to set up the windows, menus, buttons etc. and bind a procedure to each event of interest in these windows. Common events are:
· a mouse button being pressed or released.
· a combination of keystrokes.
· the mouse pointer entering or leaving the widget.
· the window being destroyed.
Once the interface has been set up, the program enters an event loop, and that is the end of the initial program sequence. This loop is actually a function of the window server. In the event loop, the operation of the GUI is suspended until an event occurs within the application. The window server then reports this event to the application and the appropriate call-back routine is executed. This process is illustrated in figure 3.
figure 3 - Graphical User Interface Implementation [3]
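The bind-then-dispatch pattern described above can be sketched without any windowing system. The following is an illustrative model only: the class EventLoop, its methods and the widget and event names are all hypothetical, and the loop drains a queue of posted events rather than blocking on a window server as a real toolkit would.

```python
from collections import deque


class EventLoop:
    """A toy event loop: call-backs are bound to (widget, event) pairs."""

    def __init__(self):
        self._bindings = {}    # (widget, event) -> call-back
        self._queue = deque()  # events waiting to be dispatched

    def bind(self, widget, event, callback):
        self._bindings[(widget, event)] = callback

    def post(self, widget, event):
        """Stand-in for the window server reporting an event."""
        self._queue.append((widget, event))

    def run(self):
        """Dispatch queued events; return how many call-backs were run."""
        handled = 0
        while self._queue:
            key = self._queue.popleft()
            callback = self._bindings.get(key)
            if callback is not None:  # events with no binding are ignored
                callback(*key)
                handled += 1
        return handled


if __name__ == "__main__":
    log = []
    loop = EventLoop()
    # Set-up phase: bind a procedure to each event of interest.
    loop.bind("play_button", "press", lambda w, e: log.append("start playback"))
    loop.bind("main_window", "destroy", lambda w, e: log.append("clean up"))
    # Events arrive in whatever order the user produces them.
    loop.post("play_button", "press")
    loop.post("play_button", "enter")  # nothing bound, so it is dropped
    loop.post("main_window", "destroy")
    loop.run()
    print(log)  # ['start playback', 'clean up']
```

The point of the sketch is that the program's own sequence ends once the bindings are made; everything afterwards is driven by which events arrive.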
Clarifying the Specification.
Having studied the background to the problem, the next stage is to make a more detailed specification of what is wanted from the application. These shall be listed in three categories, based on the discussions on VoxWare, User Interface design and similar software.
1. Driver Capabilities.
· Play-back and recording facility using the /dev/dsp device.
· An interface to the built-in mixer.
· No real time features. Studio will only be able to make use of static sample data.
2. User considerations.
· User configuration options, such as colour.
· Ability to remove redundant sections from view, to avoid cluttering.
· Make the functionality of each object clear.
· Provide information about the sample.
· All the guidelines of creating a user interface should be observed.
3. Other useful features.
· Provide format conversion (Sox), to allow any sample to be played successfully.
· A system to allow the editing of the sample by graphical means.
· Provide copy, cut and paste editing facilities.
· A way in which to add effects to the sample (Sox).
The above three sections outline the project tasks. They also roughly represent what can be seen as three interfacing problems.
1. Interfacing the Sound Card.
2. Interfacing the User.
3. Interfacing external software.
With these things in mind it was necessary to find an appropriate design approach.
Toolkit: Xview vs. TCL/Tk.
It is generally preferred to have a design approach that is independent of implementation language. However, due to my lack of previous experience in GUI design, the choice of implementation language played a major part in the whole approach. Recall that two toolkits fitted the category of being available in the public domain, Xview and TCL/Tk.
The following will briefly discuss the strengths and weaknesses of each for this application.
Xview is a toolkit which provides C-language functions to create applications with the look and feel of the openlook window manager. An Xview application is linked dynamically at run time to the Xview library of functions, which is shared with other Xview applications. Thus the executable file of the application is relatively small in size. The complexity lies within the shared Xview library. The pros and cons were found to be as follows:
+ The coding is in C, allowing close links between the User and sound card interfacing problems.
+ The interface will have the same look and feel as a standard Linux window manager.
- There is no detailed reference in the public domain, and other literature is sketchy.
- Incorporating graphics appears to require low level Xlib knowledge.
- Long compile times.
TCL/Tk is a relatively new language. It has the general look and feel of the Motif window manager. It, too, contains a C-language interface; however, it comes with an interactive shell known as wish, which interprets the TCL scripting language and allows WIMP interfaces to be created interactively.
The pros and cons were found to be as follows:
+ No compilation necessary.
+ Documentation provided and supporting literature is sufficiently detailed.
+ Interfacing command-line applications is simple, since the standard output of these applications can be read into TCL variables.
+ Efficiency may be improved by writing parts in C.
- Slow response times.
- The sound card interface will have to be written as a command-line application in C, thus communication will be less direct.
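The point about reading the standard output of a command-line application into a variable can be illustrated as follows. This is a sketch in Python rather than TCL, and the helper name run_and_capture is hypothetical; a child Python process stands in for a tool such as Sox so that the example runs anywhere.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command-line program and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in for an external tool: a child process that prints one line.
    child = [sys.executable, "-c", "print('rate 8000')"]
    output = run_and_capture(child)
    print(output.split())  # ['rate', '8000']
```

Because the output arrives as an ordinary string, the calling program can pick out the fields it needs without the external tool knowing anything about the GUI.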
The decision fell in favour of TCL/Tk, mainly because the documentation of Xview was so limited. With better documentation available, it was possible to learn TCL/Tk relatively quickly. Its interactive nature also quickly reinforced the concepts of GUI design.
This decision affected the architecture of the software. Since wish is a shell language, Studio will rely more upon the operating system calls for file manipulation. Also the sound card interface will be a stand alone application, rather than an integral function.
Choosing TCL/Tk as the implementation language made the choice of design paradigm easy. TCL/Tk's interactive programming features make a prototyping design method the natural choice. The prototyping design method is a cyclic sequence of events that refines the specification to the customer's needs, by quickly creating a prototype model and then having the customer evaluate it. This method is illustrated in figure 4. Often the prototype model or models are discarded as the specification becomes clearer. The prototype can be on paper and/or on the computer. Both were used in the Studio project.
figure 4 - Prototyping [4]
In this project a customer, as such, is non-existent. As a substitute, advice was given by my project supervisor and I compared the Studio prototype to similar applications in DOS [5], assuming that the designers of these programs have done some customer evaluation.
In addition to making the specification clearer, the prototyping method helped in learning techniques and becoming better acquainted with the implementation language.
First Prototype: Sound.
An initial prototype that managed to include most of the functions listed was called Sound. This went through a number of stages. The final prototype is pictured in figure 5.
This prototype was developed relatively quickly by using a computer design tool for TCL/Tk called Xf. This tool allows the visual interface to be created without difficulty since it provides an interface that shows all the widgets available and also their configuration options. It also has a library of predefined dialogue boxes and complex widgets; for example, a file browser for file entry was created with little effort.
figure 5 - The Sound Prototype.
A separate prototype dealing with the editing functions was created. This prototype was incorporated entirely into the final product.
Experimentation in obtaining the output of external applications proved successful, thus providing the services of Sox.
From these prototypes the following was learned.
1. The core functions are to be contained in interface sections within a single window. For monitors which can't show the whole application in one screen, individual sections may be switched in and out.
2. The interface is to have the general appearance of related real life hardware, e.g. the mixer interface is to be like a mixer board.
3. Complex user-computer dialogue is left hidden, apart from when needed.
4. The menu bar is to contain the majority of the functions, even duplicating the work of other interface sections, to allow keyboard short-cuts and allowing access to functions in the case of a section being hidden.
5. To draw a plot of the sound sample takes a significant amount of time. Thus to ensure better efficiency the plot is to be bounded by the limits of the canvas window.
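One way to bound the plot by the canvas, as point 5 suggests, is to reduce the sample to one minimum and maximum value per pixel column before drawing. The following is a minimal sketch of that idea; it is not necessarily the method used in Studio, and the function name envelope is hypothetical.

```python
def envelope(samples, width):
    """Reduce `samples` to at most `width` (min, max) pairs, one per column.

    Drawing one vertical line per pair costs `width` canvas operations,
    however long the sample is.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    n = len(samples)
    columns = min(width, n)  # never more columns than samples
    pairs = []
    for col in range(columns):
        start = col * n // columns
        end = (col + 1) * n // columns
        chunk = samples[start:end]
        pairs.append((min(chunk), max(chunk)))
    return pairs


if __name__ == "__main__":
    print(envelope([0, 5, 2, 9, 4, 4, 1, 8], 4))
    # [(0, 5), (2, 9), (4, 4), (1, 8)]
```

With this reduction the drawing time depends on the canvas width rather than on the length of the sample.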
Though the prototype was a good working model of the problem, it was necessary to scrap it, because it lacked in quality. The appearance was not well defined, the code was unintelligible (the disadvantage of using a computer design tool), and it lacked organisation.
Since this program is aimed at future modifications it is necessary that the design and structure of the program is well defined and generally should consist of reusable blocks of code or modules.
However this proves to be difficult in a graphical user interface, since data flow is difficult to manage. The easiest way to pass information into a procedure is through global variables. There are a few items that proved to be ideal for modularisation. Examples of these are a file-browser routine, a menu creation routine, and a dialogue box routine.
Also the above concept of modularisation takes significantly more work than just writing an application specific routine. Since time was of the essence here, there had to be some trade-off, therefore most of the routines are application specific, which of course increases the problems of maintenance and upgrade.
To compensate for lack of modularisation, I have redefined the term as an organisational method. By organising the code into groups of functions that are related, it will make the various components of Studio easier to find, and thus more maintainable.
The modules were created by dividing the functions of Studio into the following logical groups.
· File operations.
· Editing operations.
· Adding Effects.
· User options.
· User Help.
These groups later went on to be the titles of the menu items in the menu bar. Each functional section of the Studio interface is also assigned a module responsible for creating the section widgets and setting up the call-backs to widget events.
The implementation began in earnest once the ideas for the Studio project had been solidified and proven feasible. This involved hard and often tedious coding of the design. The tedium was removed slightly by developing TCL programming techniques, which helped condense the amount of code written. Some of these techniques are,
· Using TCL lists to mass produce similar widgets.
· Using TCL arrays for global data management.
· Keeping track of state variables with automatic variable tracing.
The details of these techniques and other implementation matters are covered in the Programmer's Guide.
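The general idea behind two of these techniques, driving widget creation from a list and reacting automatically when a state variable is written, can be sketched as follows. This is an analogy in Python, not the TCL code used in Studio; the class TracedState, the button names and the dictionaries standing in for widgets are all hypothetical.

```python
class TracedState:
    """State variables that call registered observers on every write."""

    def __init__(self):
        self._values = {}
        self._observers = {}  # variable name -> list of call-backs

    def trace(self, name, callback):
        self._observers.setdefault(name, []).append(callback)

    def set(self, name, value):
        self._values[name] = value
        for callback in self._observers.get(name, []):
            callback(name, value)

    def get(self, name):
        return self._values[name]


# One list drives the creation of several similar "widgets"; here a widget
# is just a dictionary standing in for a real button.
BUTTON_SPECS = [("play", "Play"), ("record", "Record"), ("stop", "Stop")]


def make_buttons(specs):
    return {name: {"label": label, "enabled": True} for name, label in specs}


if __name__ == "__main__":
    buttons = make_buttons(BUTTON_SPECS)
    state = TracedState()

    def on_recording(name, value):
        # Disable Play while recording, with no explicit call at the write.
        buttons["play"]["enabled"] = not value

    state.trace("recording", on_recording)
    state.set("recording", True)
    print(buttons["play"]["enabled"])  # False
```

Adding a fourth button then means adding one entry to the list, and the code that changes the state never needs to know which widgets depend on it.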
Testing is a vital part of software development. The purpose of testing is to uncover defects in function, in logic and in implementation [6].
The basic method in which software is tested is to "break" it; that is, to try and find ways in which the software fails.
This concept of breaking the software was continuously used in the implementation process. Thus individual parts of Studio are quite robust.
The testing of Studio as a whole is as yet incomplete, since the facilities to perform adequate testing were not available. To provide adequate testing the following are needed:
· A sample of the users who would use Studio. These would use Studio, evaluate the interface, and discover ways in which Sound Studio does not meet their requirements.
· A method for testing whether the sample files are being cut and pasted accurately. This is difficult because of the use of UNIX commands. Their reliability has been assumed but is not proven, and the results of certain cut and paste experiments have cast doubt on it.
· A sample of different system configurations. Sound Studio has been tested on two Linux systems, a 486-33 (8 MB RAM, SVGA, 8-bit Sound Blaster) and a 386 (4 MB RAM, VGA, 16-bit Sound Blaster), on which Studio performed adequately. The main area of concern is different sound cards. Studio should be suitable for all systems; however, that still needs to be tested.
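One possible check for cut and paste accuracy is a round-trip test: cutting a region and pasting it back at the same position must reproduce the original data byte for byte. The sketch below works on data in memory and so only serves as a reference model; the function names are hypothetical, and the files produced by Studio's UNIX commands would have to be compared against such reference results.

```python
def cut(data: bytes, start: int, end: int):
    """Remove data[start:end]; return (remaining, clipboard)."""
    return data[:start] + data[end:], data[start:end]


def paste(data: bytes, position: int, clipboard: bytes) -> bytes:
    """Insert clipboard into data at the given byte position."""
    return data[:position] + clipboard + data[position:]


def round_trip_ok(data: bytes, start: int, end: int) -> bool:
    """Cut a region and paste it back in place; nothing may be lost."""
    remaining, clipboard = cut(data, start, end)
    restored = paste(remaining, start, clipboard)
    return restored == data and len(remaining) + len(clipboard) == len(data)


if __name__ == "__main__":
    sample = bytes(range(256))
    # Include the edge cases: empty region, whole file, last byte only.
    regions = [(0, 0), (0, 256), (10, 20), (255, 256)]
    print(all(round_trip_ok(sample, s, e) for s, e in regions))  # True
```

A test pattern in which every byte differs from its neighbours, as here, makes an off-by-one error at either end of the region visible immediately.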
Though the facilities were available, testing Studio in a multi-user/networked environment has not yet been performed. There has actually been little consideration given to how Studio should work within the multi-user/networked environment typical of a UNIX and X system. Thus it needs to be seen how Studio performs with multiple instances in operation. There will almost certainly have to be modifications in this area and consideration for concepts such as common clipboards must be given.
As of version 1.0, a port to the Solaris platform has also been made available. The number of hardware configurations tested under Linux has also increased dramatically. The Linux version is the primary one; the Solaris one is offered as-is and unfortunately cannot be actively maintained.
Version 1.0 of Studio is Sox-12.15 compliant, and this, together with the Sox maintainer's constant help and support, has pretty much alleviated the problems associated with Sox which did not arise from the Studio interface.
In the testing that was performed, the interface to Sox also proved to be of particular concern. Studio is reliant on Sox to provide effects, format conversion and sample file information, such as the sampling rate. The following are the results of testing these areas:
· The interface to Sox breaks down in some of the format conversions. Some of them e.g. .hcom can only accept a discrete set of sampling rates, a fact that Studio hasn't taken into account. The reason for this is that it was not documented in the Sox manual page.
· An area of potential breakdown is the way in which Studio has to obtain the sampling data from Sox. The data is obtained by using Sox in verbose mode and redirecting this information into a file. Studio then goes into this file and attempts to find the data it needs. The contents of this file are of the following form;
Sox: Type AUTO changed to wav
Sox: Input file: using sample rate 8000
size bytes, style unsigned, 1 channel
Sox: Input file: comment "/home/paul/blast2/sickbay.wav"
Maximum amplitude: 0.195
Minimum amplitude: 0.000
Mean amplitude: 0.006
Maximum delta: 0.141
Minimum delta: 0.000
Mean delta: 0.006
Volume adjustment: 5.120
Verbose Sox Output
It is clear that this data is not well structured, and indeed in earlier attempts this method failed when reading 16-bit files, because some extra words were added. The current version circumvents this problem and has not failed, yet!
The time performance of Studio is not superb. This was expected as a side effect of using TCL/Tk and of using temporary files, which require file I/O; this is time consuming, particularly when a large sample is involved. As a benchmark of system performance, the loading operation has been timed with files of various sizes. The results are illustrated in the plot of figure 6.
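A more tolerant way to read such output is to search for keywords rather than rely on fixed word positions. The following is a minimal sketch using the listing above as test data; the function name parse_sox_verbose and the dictionary keys are hypothetical, and the patterns are assumptions based only on this one listing, so other Sox versions or formats may need different ones.

```python
import re

SAMPLE_OUTPUT = """\
Sox: Type AUTO changed to wav
Sox: Input file: using sample rate 8000
size bytes, style unsigned, 1 channel
Sox: Input file: comment "/home/paul/blast2/sickbay.wav"
Maximum amplitude: 0.195
Minimum amplitude: 0.000
Mean amplitude: 0.006
Maximum delta: 0.141
Minimum delta: 0.000
Mean delta: 0.006
Volume adjustment: 5.120
"""


def parse_sox_verbose(text):
    """Extract what can be recognised; fields that are absent are omitted."""
    info = {}
    m = re.search(r"sample rate\s+(\d+)", text)
    if m:
        info["rate"] = int(m.group(1))
    m = re.search(r"size\s+(\w+),\s*style\s+([\w-]+),\s*(\d+)\s+channel", text)
    if m:
        info["size"] = m.group(1)
        info["style"] = m.group(2)
        info["channels"] = int(m.group(3))
    # Lines of the form "Two words: number", e.g. "Maximum amplitude: 0.195".
    stat = re.compile(r"^([A-Z][a-z]+ [a-z]+):\s+(-?\d+(?:\.\d+)?)\s*$",
                      re.MULTILINE)
    for name, value in stat.findall(text):
        info[name.lower()] = float(value)
    return info


if __name__ == "__main__":
    info = parse_sox_verbose(SAMPLE_OUTPUT)
    print(info["rate"], info["channels"], info["maximum amplitude"])
    # 8000 1 0.195
```

Because missing fields are simply left out of the result, the caller can detect an unexpected output layout instead of silently reading the wrong word.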
figure 6 - Performance test results.
This shows clearly the increase in delay time for loading. These delays represent an average of 0.028 seconds per byte on the 486-33 on which the measurements were performed. This is not a terrible figure considering the amount of data being processed. The real inefficiency is that editing operations, such as applying effects, cutting and pasting, incur delays comparable to the loading time. Thus there can be a lot of annoying waits for the user. Also, these measurements were taken with no other applications running; longer delays can be expected under multitasking conditions.
The Studio project completed most of its aims, since a working piece of software has been created, that performs its function adequately. However the Studio project, like most pieces of software, is not complete. As is clear from testing there are areas in which Studio needs improvement and more rigorous testing. The following are the main areas of focus for the near future.
Preparation for distribution.
One of the Studio project aims was to return Studio to the public domain, to provide a useful service to those who require it. It also has the purpose of exposing Studio to hundreds of ruthless testers, who will either wittingly or unwittingly attempt to destroy Studio. This will allow Studio to be improved and made more robust.
However there are a number of small things that still need to be done to prepare Studio for distribution. These are,
· to provide a Makefile or set-up routine, to compile and configure Studio for the user's system.
· to create a manual page for the system manual.
These are all relatively small matters, but I still need to learn how to do them.
As explained in testing, there are ways in which the interface between Sox and Studio needs to be improved. The first area is in retrieving the sample information. This can be done by modifying Sox to provide a more structured output of sampling parameters.
The second area is in the sample format requirements. This can only be improved by studying the code of Sox or learning more about the different sample formats, and then reconsidering the method in which this data is obtained in Studio.
It would be useful to contact the authors of Sox to do these things.
It is clear from the section on testing that there are many areas of Studio to be tested more thoroughly. There are also a number of small refinements still to be made in the user interface.
The major area in which Studio could do with improvement is in reducing the number of taxing waits. This has been a continuous concern and many of the implementation decisions have been made with this in mind. The way that the plot is magnified is one example, since drawing in a TCL/Tk canvas widget is time consuming.
Attempts were made to use the C-interface tools of TCL/tk to provide added efficiency in the procedures of concern as well as providing an integrated set of sound card procedures. These attempts, initially, were successful. However when it came to adding the Tk commands the only simple way was to include the entire wish application. This seemed a bit heavy handed, since wish is a large application. Thus due to the time constraints of the project it was felt necessary to leave this to future work, since it will require additional study and perhaps just as much time again to create this C-implementation. If a C-implementation is ever attempted it may be beneficial to include the code of Sox.
Efficiency will almost certainly be improved by removing the need for temporary files. In TCL/Tk there is a mechanism by which data can be written directly into a pipe. Thus it may be possible to keep the majority of the data now stored in temporary files in TCL variables and pipe the data from the variable through the external processes, such as Sox. This alone may be sufficient to reduce delays, and would be far simpler than the large task of C-coding.
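The idea of piping data held in a variable through an external process, with no temporary file, can be sketched as follows. This is an illustration in Python rather than TCL; the helper name pipe_through is hypothetical, and a child Python process that reverses its input stands in for an effect applied by a tool such as Sox.

```python
import subprocess
import sys


def pipe_through(argv, data: bytes) -> bytes:
    """Write `data` to the program's standard input; return its output."""
    result = subprocess.run(argv, input=data, capture_output=True, check=True)
    return result.stdout


# Stand-in for an external effect: a filter that reverses the bytes it reads.
REVERSE_FILTER = [
    sys.executable, "-c",
    "import sys; d = sys.stdin.buffer.read(); sys.stdout.buffer.write(d[::-1])",
]

if __name__ == "__main__":
    sample = bytes([1, 2, 3, 4])
    print(list(pipe_through(REVERSE_FILTER, sample)))  # [4, 3, 2, 1]
```

Whether this is faster in practice than temporary files would still have to be measured, since the whole sample must then be held in memory.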
In conclusion I wish to discuss some of the merits of this project that have become apparent during its course.
Initially the project seemed to be a duplication of work, since similar products are available, though not on the Linux platform. In a world where software is portable and not bound to economics, Sound Studio would not have needed to be written at this time.
A world is emerging however where software is written with portability in mind and economics is not the master. This is the world of the "information superhighway". In this world the need for Sound Studio is apparent. Linux as a child of the Internet is increasing in popularity, perhaps because it has all that is required for adequate networking. But Linux is still a fledgling in many ways, and by creating Sound Studio it is hoped that the strength of Linux' position will be increased.
Sound Studio also shows how an application can take on some complicated functions without much sophisticated programming. Sound Studio could never have achieved its complexity, within the time scale, without ready-made components, particularly Sox. Although sound card applications were written as part of the Studio project, it may have been possible to incorporate ready-made programs such as vrec. This characteristic of Sound Studio should make it very extensible to UNIX systems other than Linux, provided a command-line sound card application is available.
The personal benefits have been a solid introduction into UNIX and X-windows and graphical user interface programming. It has also allowed me to learn a language that will allow me to create simple WIMPs easily and quickly, for any future applications.
The true value of this project can only come by Studio actually being used. This I truly hope will happen and hope that Sound Studio will make life easier for a large number of people across the globe.